After collecting data about the things people like, you need a way
to determine how similar people are in their tastes. You do this by
comparing each person with every other person and calculating a
*similarity score*. There are a few ways to do this,
and in this section I'll show you two systems for calculating similarity
scores: *Euclidean distance* and *Pearson
correlation*.

One very simple way to calculate a similarity score is to use a Euclidean distance score, which takes the items that people have ranked in common and uses them as axes for a chart. You can then plot the people on the chart and see how close together they are, as shown in Figure 2-1.

Figure 2-1. People in preference space

This figure shows the people charted in *preference
space*. Toby has been plotted at 4.5 on the Snakes axis and
at 1.0 on the Dupree axis. The closer two people are in the preference
space, the more similar their preferences are. Because the chart is
two-dimensional, you can only look at two rankings at a time, but the
principle is the same for bigger sets of rankings.
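
As a rough sketch of how a person becomes a point in preference space, the snippet below picks out the items two people have both ranked and reads off each person's coordinates on those axes. Toby's two scores come from the figure description above. The `prefs` dictionary layout, the second person `Reviewer B` and all of their scores, and the helper names `shared_items` and `coordinates` are assumptions made purely for illustration.

```python
# Hypothetical ratings dictionary: person -> {item: score}.
# Toby's two scores match the text; everything else is invented for illustration.
prefs = {
    'Toby': {'Snakes': 4.5, 'Dupree': 1.0},
    'Reviewer B': {'Snakes': 3.0, 'Dupree': 2.5, 'Other Film': 4.0},
}

def shared_items(prefs, person1, person2):
    """Return the items both people have ranked, in sorted order."""
    return sorted(set(prefs[person1]) & set(prefs[person2]))

def coordinates(prefs, person, axes):
    """Return the person's score along each axis as a tuple."""
    return tuple(prefs[person][item] for item in axes)

# Only items ranked by both people become axes; 'Other Film' is dropped.
axes = shared_items(prefs, 'Toby', 'Reviewer B')
toby_point = coordinates(prefs, 'Toby', axes)
other_point = coordinates(prefs, 'Reviewer B', axes)
```

With more shared items, the tuples simply gain more entries, which is why the idea carries over to charts you could no longer draw.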

To calculate the distance between Toby and LaSalle in the chart,
take the difference along each axis, square each difference, add the
squares together, and then take the square root of the sum. In Python,
you can use the `pow(n,2)` function to square a number and the `sqrt`
function to take the square root:

>>`from math ...`
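
A minimal sketch of that calculation follows. Toby's position (4.5 on Snakes, 1.0 on Dupree) is taken from the text. LaSalle's position (4.0, 2.0) is assumed here for illustration, since the figure itself is not reproduced. The helper `euclidean_distance` is a hypothetical name, included only to show how the same steps extend to any number of shared rankings.

```python
from math import sqrt

# Toby's coordinates come from the text; LaSalle's are assumed for illustration.
toby = (4.5, 1.0)      # (Snakes, Dupree)
lasalle = (4.0, 2.0)   # (Snakes, Dupree), hypothetical values

# Two-dimensional case: square the difference on each axis, add, take the root.
distance_2d = sqrt(pow(toby[0] - lasalle[0], 2) + pow(toby[1] - lasalle[1], 2))

def euclidean_distance(point1, point2):
    """Apply the same calculation to points with any number of axes."""
    return sqrt(sum(pow(a - b, 2) for a, b in zip(point1, point2)))

print(distance_2d)  # about 1.118 for the assumed coordinates
```

A smaller distance means the two people sit closer together in preference space, so their tastes are more alike.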
