You can interpret the clusters more clearly by viewing them as a dendrogram. Hierarchical clustering results are usually viewed this way, since dendrograms pack a lot of information into a relatively small space. Since the dendrograms will be graphical and saved as JPGs, you'll need to download the Python Imaging Library (PIL), which is available at http://pythonware.com.

This library comes with an installer for Windows and source
distributions for other platforms. More information on downloading and
installing the PIL is available in Appendix A. The PIL makes it very easy to
generate images with text and lines, which is all you'll really need to
construct a dendrogram. Add the following `import` statement to the
beginning of *clusters.py*:

from PIL import Image,ImageDraw
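To give a feel for the "text and lines" drawing mentioned above, here is a minimal sketch that is not part of *clusters.py*. It assumes the PIL (or its maintained fork, Pillow) is installed. The image size, coordinates, colors, and the label text are arbitrary values chosen for illustration, and the image is written to an in-memory buffer rather than a file so that nothing is left on disk.

```python
from io import BytesIO

from PIL import Image, ImageDraw

# Create a blank white canvas, 200 pixels wide and 100 pixels high
img = Image.new('RGB', (200, 100), (255, 255, 255))
draw = ImageDraw.Draw(img)

# Draw a horizontal red line, similar to a dendrogram branch
draw.line((10, 50, 100, 50), fill=(255, 0, 0))

# Draw a black text label just to the right of the line (default font)
draw.text((105, 43), 'example item', fill=(0, 0, 0))

# Encode the image as a JPEG into an in-memory buffer
buf = BytesIO()
img.save(buf, 'JPEG')
```

To write a file instead, pass a filename such as `'example.jpg'` (a hypothetical name) to `img.save` in place of the buffer.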

The first step is to write a function that returns the total height
of a given cluster. When determining the overall height of the image,
and where to put the various nodes, it's necessary to know their total
heights. If this cluster is an endpoint (i.e., it has no branches), then
its height is 1; otherwise, its height is the sum of the heights of its
branches. This is easily defined as a recursive function, which you can
add to *clusters.py*:

def getheight(clust):
    # Is this an endpoint? Then the height is just 1
    if clust.left is None and clust.right is None:
        return 1
    # Otherwise the height is the sum of the heights of
    # each branch
    return getheight(clust.left) + getheight(clust.right)
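As a quick check of the recursion, the sketch below applies the same height rule to a tiny hand-built tree. The `Node` class is a hypothetical stand-in for whatever cluster class *clusters.py* defines, and it assumes only that clusters expose `left` and `right` attributes. The function is repeated here so that the example runs on its own.

```python
class Node:
    """Hypothetical stand-in for a cluster: only left and right branches."""

    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right


def getheight(clust):
    # An endpoint (no branches) has a height of 1
    if clust.left is None and clust.right is None:
        return 1
    # Otherwise the height is the sum of the heights of each branch
    return getheight(clust.left) + getheight(clust.right)


# Three endpoints merged in two steps: (a, b) first, then ((a, b), c)
a, b, c = Node(), Node(), Node()
ab = Node(left=a, right=b)
root = Node(left=ab, right=c)

print(getheight(a))     # 1
print(getheight(ab))    # 2
print(getheight(root))  # 3
```

Under this definition the height of a cluster equals the number of endpoints beneath it, which is why it is useful for deciding how much vertical space each node needs in the image.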

The other thing you need to know ...
