Programming Collective Intelligence by Toby Segaran

Basic Linear Classification

This is one of the simplest classifiers to construct, but it's a good basis for further work. It works by finding the average of all the data in each class and constructing a point that represents the center of the class. It can then classify new points by determining to which center point they are closest.
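The closest-center rule can be sketched in a few lines. This is only a minimal illustration using plain Euclidean distance; the function name nearest_center and the center values are hypothetical and are not taken from advancedclassify.py or the book's data.

```python
import math

def nearest_center(point, centers):
    """Return the label of the center closest to point (Euclidean distance)."""
    best_label = None
    best_dist = None
    for label, center in centers.items():
        # Straight-line distance between the point and this class center
        dist = math.sqrt(sum((p - c) ** 2 for p, c in zip(point, center)))
        if best_dist is None or dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

# Hypothetical class centers (assumed values, one per class)
centers = {0: [26.0, 35.0], 1: [35.0, 33.0]}

# The point [30.0, 30.0] is closer to the center of class 1 than class 0
guess = nearest_center([30.0, 30.0], centers)
```

The sketch assumes the centers are already known; computing them from the training data is the part that comes first.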

To do this, you'll first need a function that calculates the average point for each class. In this case, the classes are just 0 and 1. Add lineartrain to advancedclassify.py:

def lineartrain(rows):
  averages={}
  counts={}

  for row in rows:
    # Get the class of this point
    cl=row.match

    averages.setdefault(cl,[0.0]*(len(row.data)))
    counts.setdefault(cl,0)

    # Add this point to the averages
    for i in range(len(row.data)):
      averages[cl][i]+=float(row.data[i])

    # Keep track of how many points in each class
    counts[cl]+=1

  # Divide sums by counts to get the averages
  for cl,avg in averages.items():
    for i in range(len(avg)):
      avg[i]/=counts[cl]

  return averages

You can run this function in your Python session to get the averages:

>>> reload(advancedclassify)
<module 'advancedclassify' from 'advancedclassify.pyc'>
>>> avgs=advancedclassify.lineartrain(agesonly)
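To make the shape of the result concrete, here is a small self-contained sketch of the same per-class averaging on made-up data. The Row type, the class_averages helper, and the age pairs are all assumptions for illustration; they stand in for the book's row objects and data set, which are not reproduced here.

```python
from collections import namedtuple

# Hypothetical stand-in for the row objects: a feature list plus a 0/1 label
Row = namedtuple('Row', ['data', 'match'])

def class_averages(rows):
    """Sum the features of each class, then divide by the class count."""
    sums, counts = {}, {}
    for row in rows:
        total = sums.setdefault(row.match, [0.0] * len(row.data))
        for i, value in enumerate(row.data):
            total[i] += float(value)
        counts[row.match] = counts.get(row.match, 0) + 1
    return {cl: [s / counts[cl] for s in total] for cl, total in sums.items()}

# Made-up age pairs (assumed values, not the book's data)
toy_rows = [
    Row([20, 30], 0),
    Row([30, 40], 0),
    Row([40, 42], 1),
    Row([50, 48], 1),
]

toy_avgs = class_averages(toy_rows)
# toy_avgs == {0: [25.0, 35.0], 1: [45.0, 45.0]}
```

The result is a dictionary with one entry per class, each holding the average value of every feature across the rows in that class.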

To see why this is useful, consider again the plot of the age data, shown in Figure 9-4.

Figure 9-4. Linear classifier using averages

The Xs in the figure represent the average points as calculated by lineartrain. The line ...
