Preface

Search engines for large collections of data preceded the World Wide Web by decades. There were those massive library catalogs, hand-typed with painstaking precision on index cards and eventually, to varying degrees, automated. There were the large data collections of professional information companies such as Dialog and LexisNexis. Then there are the still-extant private, expensive medical, real estate, and legal search services.

Those data collections were not always easy to search, but with a little finesse and a lot of patience, it was always possible to search them thoroughly. Information was grouped according to established ontologies, data preformatted according to particular guidelines.

Then came the Web.

Information on the Web, as anyone who’s ever looked at half a dozen web pages knows, is not all formatted the same way. Nor is it necessarily particularly accurate. Nor up to date. Nor spellchecked. Nonetheless, search engines cropped up, trying to make sense of the rapidly increasing index of information online. Eventually, special syntaxes were added for searching common parts of the average web page (such as title or URL). Search engines evolved rapidly, trying to encompass all the nuances of the billions of documents online, and they continue to evolve today.

Google™ threw its hat into the ring in 1998. The service was the second incarnation of a search engine known as BackRub, and the name “Google” was a play on the word “googol,” a one followed by a hundred zeros. From the beginning, Google was different from the other major search engines online (AltaVista, Excite, HotBot, and others).

Was it the technology? Partially. The relevance of Google’s search results was outstanding and worthy of comment. But more than that, Google’s focus and more human face made it stand out online.

With its friendly presentation and its constantly expanding set of options, it’s no surprise that Google continues to attract fans. There are weblogs devoted to it. Search engine newsletters, such as ResearchBuzz, spend a lot of time covering Google. Legions of devoted fans spend lots of time uncovering undocumented features, creating games (like Google whacking), and even coining new words (like “Googling,” the practice of checking out a prospective date or hire via Google’s search engine).

In April 2002, Google reached out to its fan base by offering the Google API. The Google API gives developers a legal way to access the Google search results with automated queries (any other way of accessing Google’s search results with automated software is against Google’s Terms of Service).

Why Google Hacks?

“Hacks” are generally considered to be “quick-n-dirty” solutions to programming problems or interesting techniques for getting a task done. But what does this kind of hacking have to do with Google?

Considering the size of the Google index, there are many times when you might want to do a particular kind of search and you get too many results for the search to be useful. Or you may want to do a search that the current Google interface does not support.

The idea of Google Hacks is not to give you some exhaustive manual of how every command in the Google syntax works, but rather to show you some tricks for making the best use of a search and show applications of the Google API that perform searches that you can’t perform using the regular Google interface. In other words, hacks.

Dozens of programs and interfaces have sprung up from the Google API. Both games and serious applications using Google’s database of web pages are available, written by everyone from the serious programmer to the devoted fan (like me).
