Chapter 2. Weighted Lists

So, what is a tag cloud? A tag cloud is a specific kind of weighted list. For lack of a standard working definition of weighted list, I'm going to make one up.

Weighted list

n. A list of words or phrases, in which one or more visual features in the list (such as font size) are correlated to some underlying data.

While tag clouds are a specific type of weighted list, not all weighted lists are tag clouds. For example, the list of cities at the popular craigslist web site (Figure 2) is a weighted list because font size is correlated with popularity, but it lacks the random appearance of a tag cloud, due to the arrangement of the cities in a matrix.

Figure 2. Weighted cities list from craigslist
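To make the idea of correlating font size with underlying data concrete, here is a minimal Python sketch. It assumes simple linear scaling between a smallest and a largest pixel size. That scaling is only one plausible choice, not necessarily what craigslist does. The function name font_sizes, the pixel bounds, and the popularity figures are all invented for illustration.

```python
def font_sizes(weights, min_px=10, max_px=32):
    """Map each item's weight to a font size by linear interpolation.

    `weights` is a non-empty dict of item -> numeric weight.
    """
    lo, hi = min(weights.values()), max(weights.values())
    span = hi - lo
    sizes = {}
    for item, w in weights.items():
        # If every weight is equal, fall back to the smallest size.
        fraction = (w - lo) / span if span else 0.0
        sizes[item] = round(min_px + fraction * (max_px - min_px))
    return sizes


# Hypothetical popularity figures, made up for this example.
popularity = {"boston": 40, "chicago": 70, "new york": 100, "tulsa": 10}
sizes = font_sizes(popularity)
```

The least popular city gets min_px and the most popular gets max_px, with everything else falling in between. Laying the results out in a fixed grid, rather than a flowing paragraph of words, is what separates a list like this from a tag cloud.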

Another kind of weighted list, one that's even more distant from tag clouds, is the pair of lists provided by Amazon.com: statistically improbable phrases (SIPs) and capitalized phrases (CAPs) (Figure 3). Here the weighted visual feature is position rather than font size. In the SIP list, word order correlates to the improbability of the phrase, and in the CAP list, to the frequency with which the phrase appears in the book.

Figure 3. Weighted phrase lists from Amazon.com
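A list weighted by word order can be sketched just as briefly. The Python example below sorts phrases by a numeric weight so that position in the list carries the information. The function name order_by_weight and the phrase counts are hypothetical. How Amazon.com actually scores its phrases is not described here, so the weights are simply taken as given.

```python
def order_by_weight(phrases, descending=True):
    """Return the phrases sorted by weight, so list position reflects weight."""
    ranked = sorted(phrases.items(), key=lambda pair: pair[1], reverse=descending)
    return [phrase for phrase, _weight in ranked]


# Hypothetical phrase counts, made up for this example.
counts = {"tag cloud": 12, "weighted list": 31, "font size": 7}
ordered = order_by_weight(counts)
# ordered == ["weighted list", "tag cloud", "font size"]
```

Because Python's sorted is stable, phrases with equal weights keep their original relative order.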

Creating Weighted Lists

There are lots of ways to make weighted lists. Given any list of words or phrases, there are a handful of visual features that you can choose to correlate ...
