Chapter 2. Advanced Web

Hacks 21-49

If you’ve just arrived from Chapter 1 and think that you have more than enough information to Google yourself silly, hold on to your hat. Now you’ll put into high gear all that you’ve learned about the ins and outs of Googling.

In this chapter you’ll meander through your Google neighborhood, range farther across the Web, dig deeper into individual sites, twist and recombine your queries, squeeze the last drop of results out of every search, and even go beyond the bounds of Google’s index, all without wearing out your fingers.

Because you’ll get your computer to do the lion’s share of the work for you.

This chapter hacks Google programmatically. Through bite-sized programs, we’ll introduce you to the kind of trawling, crawling, and recombination that’s possible with just a few lines of code. And it’s all possible thanks to something called the Google API—that’s Application Programming Interface, or Google for computers.

In April 2002, Google announced an alternate interface to the friendly search box you see on Google.com. They opened up their index to anyone with a little programming know-how and a reasonable amount of patience. Initially, this wasn’t much to write home about. Some of the earliest applications simply Googled and incorporated the results into a web page, the so-called Google boxes [Hack #22]. But as more people experimented with the API, the variety of applications grew from the marginally interesting to the seriously useful. And so was born ...
