Juxtaposing Latent Social Networks (or #JustinBieber Versus #TeaParty)
One of the most fascinating aspects of data mining is that it affords you the ability to discover new knowledge from existing information. There really is something to be said for the old adage that “knowledge is power,” and it’s especially true in an age where the amount of information available is steadily growing with no indication of decline. As an interesting exercise, let’s see what we can discover about some of the latent social networks that exist in the sea of Twitter data. The basic approach we’ll take is to collect some focused data on two or more topics in a specific way by searching on a particular hashtag, and then apply some of the same metrics we coded up in the previous section (where we analyzed Tim’s tweets) to get a feel for the similarity between the networks.
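To make the idea of comparing two hashtag-focused samples a little more concrete, here is a minimal sketch. It assumes each tweet is available as a plain text string, and it uses Jaccard similarity over the sets of hashtags and @mentions as one possible similarity measure; that choice, along with the helper names `extract_entities` and `jaccard`, is an illustrative assumption and not necessarily one of the metrics from the previous section. The sample tweets are made-up placeholders, not collected data.

```python
import re
from collections import Counter

# Simple patterns for tweet entities; real tweets may need more careful parsing.
HASHTAG_RE = re.compile(r"#(\w+)")
MENTION_RE = re.compile(r"@(\w+)")


def extract_entities(tweets):
    """Count lowercased hashtags and @mentions across a list of tweet texts."""
    hashtags, mentions = Counter(), Counter()
    for text in tweets:
        hashtags.update(tag.lower() for tag in HASHTAG_RE.findall(text))
        mentions.update(name.lower() for name in MENTION_RE.findall(text))
    return hashtags, mentions


def jaccard(a, b):
    """Jaccard similarity of two collections: |A & B| / |A | B| (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


if __name__ == "__main__":
    # Hypothetical placeholder tweets, for illustration only.
    sample_a = ["RT @alice: rally tonight #teaparty #tcot", "#teaparty news from @bob"]
    sample_b = ["I love this song #justinbieber #music", "RT @carol: new video #justinbieber #tcot"]

    tags_a, users_a = extract_entities(sample_a)
    tags_b, users_b = extract_entities(sample_b)

    print("Top hashtags (A):", tags_a.most_common(3))
    print("Top hashtags (B):", tags_b.most_common(3))
    print("Hashtag overlap:", jaccard(tags_a, tags_b))
    print("Mention overlap:", jaccard(users_a, users_b))
```

A score near 0 suggests the two samples share few hashtags or mentioned users, while a score near 1 suggests heavy overlap. With only a couple of thousand tweets per topic, treat any such number as a rough signal and not a firm conclusion.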
Since there’s no such thing as a “stupid question,” let’s move forward in the spirit of famed economist Steven D. Levitt[33] and ask the question, “What do #TeaParty and #JustinBieber have in common?”[34]
Example 5-14 provides a simple mechanism for collecting approximately the most recent 1,500 tweets (the maximum currently returned by the search API) on a particular topic and storing them away in CouchDB. Like other listings you’ve seen earlier in this chapter, it includes simple map/reduce logic to incrementally update the tweets in the event that you’d like to run it over a longer period of time to collect ...
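The incremental-update idea can be sketched independently of Twitter and CouchDB. The following is not Example 5-14; it is a minimal in-memory illustration under two assumptions: each tweet is a dict with an integer `id` field, and larger IDs correspond to more recent tweets. The function names `max_tweet_id` and `merge_new_tweets` are hypothetical. The "reduce" step finds the largest ID already stored, and a later run keeps only tweets newer than that.

```python
def max_tweet_id(stored):
    """Reduce step: return the largest tweet ID stored so far, or 0 if the store is empty."""
    return max(stored, default=0)


def merge_new_tweets(stored, fetched):
    """Add only tweets newer than anything already stored.

    `stored` is a dict mapping tweet ID -> tweet (standing in for the database);
    `fetched` is an iterable of tweet dicts from a new collection run.
    Returns the number of tweets added.
    """
    since_id = max_tweet_id(stored)
    added = 0
    for tweet in fetched:
        # Skip anything at or below the high-water mark, and duplicates within this batch.
        if tweet["id"] > since_id and tweet["id"] not in stored:
            stored[tweet["id"]] = tweet
            added += 1
    return added


if __name__ == "__main__":
    store = {}
    # Hypothetical placeholder tweets, for illustration only.
    first_run = [{"id": 1, "text": "first"}, {"id": 2, "text": "second"}]
    second_run = [{"id": 2, "text": "second"}, {"id": 3, "text": "third"}]

    print(merge_new_tweets(store, first_run), "added on the first run")
    print(merge_new_tweets(store, second_run), "added on the second run")
    print("IDs in store:", sorted(store))
```

In a real collector, the high-water mark would typically be passed to the search request so that older tweets are never downloaded again; whether and how a given API supports that is something to confirm in its documentation.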