Chapter 2. Microformats: Semantic Markup and Common Sense Collide

In terms of the Web’s ongoing evolution, microformats are an important step forward because they provide an effective mechanism for embedding “smarter data” into web pages and are easy for content authors to implement. Put succinctly, microformats are simply conventions for unambiguously embedding structured data in web pages in an entirely value-added way. This chapter begins by briefly introducing the microformats landscape and then digs right into some examples involving specific uses of the XFN (XHTML Friends Network), geo, hRecipe, and hReview microformats. In particular, we’ll mine human relationships out of blogrolls, extract coordinates from web pages, parse out recipes from foodnetwork.com, and analyze reviews on some of those recipes. The example code listings in this chapter aren’t intended to be “full spec parsers,” but they should be more than enough to get you on your way.
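To make the idea of "conventions for embedding structured data" concrete, here is a minimal sketch of pulling coordinates out of geo-style markup. It assumes the common geo pattern of an element with class "geo" wrapping elements with classes "latitude" and "longitude". The sample HTML and the names GeoParser and extract_geo are made up for illustration and are not taken from this chapter's listings, and the sketch uses only Python's standard library rather than a dedicated HTML parsing package.

```python
from html.parser import HTMLParser

# Hypothetical sample markup following the geo class-name convention.
SAMPLE_HTML = """
<p>Nashville, TN:
  <span class="geo">
    <span class="latitude">36.166</span>;
    <span class="longitude">-86.784</span>
  </span>
</p>
"""


class GeoParser(HTMLParser):
    """Collects (latitude, longitude) pairs from geo-style markup.

    Illustrative only: it handles the simple nested-span form and
    ignores other variants a full parser would need to cover.
    """

    def __init__(self):
        super().__init__()
        self.coords = []     # completed (lat, lon) tuples
        self._field = None   # name of the field whose text we expect next
        self._current = {}   # fields gathered for the geo element in progress

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if "geo" in classes:
            self._current = {}  # a new geo element starts a fresh record
        for name in ("latitude", "longitude"):
            if name in classes:
                self._field = name

    def handle_data(self, data):
        text = data.strip()
        if self._field and text:
            self._current[self._field] = float(text)
            self._field = None
            if len(self._current) == 2:
                self.coords.append(
                    (self._current["latitude"], self._current["longitude"])
                )
                self._current = {}


def extract_geo(html):
    """Return a list of (lat, lon) tuples found in the given HTML string."""
    parser = GeoParser()
    parser.feed(html)
    return parser.coords


if __name__ == "__main__":
    print(extract_geo(SAMPLE_HTML))
```

The point of the sketch is that the page stays ordinary HTML for human readers, while the agreed-upon class names let a program recover the data without guessing.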

Although it might be somewhat of a stretch to call data decorated with microformats like geo or hRecipe “social data,” it’s still interesting and will inevitably play an increased role in social data mashups. At the time this book was written, nearly half of all web developers reported some use of microformats, the microformats.org community had just celebrated its fifth birthday, and Google reported that 94% of the time, microformats are involved in Rich Snippets. If Google has anything to ...
