We've seen how searchers behave and how they interact with search results. We've decided what queries we want our sites to be found for. How do search engines compile these lists?
In the early days of the Web, directories were built to help users navigate to various Web sites. Generally, these directories were created by hand: people categorized Web sites so users could browse to what they wanted. As the Web grew larger, this effort became more difficult, and "Web spiders" were created to "crawl" Web sites. Web spiders, also known as robots, are computer programs that follow links from known Web sites to other Web sites. These robots access those pages, download the contents of those pages (into a storage mechanism generically referred to as an "index"), and add the links found on those pages to their list for later crawling.
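That crawling loop can be sketched in a few lines of Python. This is a minimal illustration, not how any real search engine is implemented: it uses a tiny hypothetical in-memory "web" (the URLs and page text are made up) instead of real HTTP requests.

```python
from collections import deque

# A tiny in-memory "web": URL -> (page contents, outbound links).
# These pages and URLs are hypothetical, for illustration only.
WEB = {
    "a.example": ("welcome page", ["b.example", "c.example"]),
    "b.example": ("about page", ["a.example"]),
    "c.example": ("contact page", []),
}

def crawl(seed):
    """Follow links from known pages, downloading contents into an index."""
    index = {}             # the "index": URL -> downloaded contents
    queue = deque([seed])  # pages discovered but not yet crawled
    seen = {seed}
    while queue:
        url = queue.popleft()
        text, links = WEB.get(url, ("", []))
        index[url] = text            # store the page's contents
        for link in links:           # add new links for later crawling
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index

print(sorted(crawl("a.example")))  # ['a.example', 'b.example', 'c.example']
```

Starting from a single seed page, the spider discovers and downloads every page reachable by links, which is exactly how crawlers grew their lists far beyond what hand-built directories could cover.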
While Web crawlers gave the early search engines a far larger list of sites than manual collection could, they couldn't perform the other manual tasks: figuring out what the pages were about and ranking them in order of which ones were best. The search engines started building computer programs to handle these tasks as well. For instance, a program could catalog all the words on a page to help determine what that page was about.
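Cataloging the words on each page amounts to building what is commonly called an inverted index: a map from each word to the pages that contain it. Here is a minimal sketch, again using hypothetical page contents rather than real crawled data.

```python
def build_word_catalog(pages):
    """Map each word to the set of pages it appears on (an inverted index)."""
    catalog = {}
    for url, text in pages.items():
        # set() so a word repeated on one page is recorded only once
        for word in set(text.lower().split()):
            catalog.setdefault(word, set()).add(url)
    return catalog

# Hypothetical downloaded page contents, for illustration only.
pages = {
    "a.example": "fresh roasted coffee beans",
    "b.example": "coffee brewing guides",
}
catalog = build_word_catalog(pages)
print(sorted(catalog["coffee"]))  # ['a.example', 'b.example']
```

With a catalog like this, a search engine can answer the question "which pages mention this query word?" instantly, without rescanning every page, though it still says nothing about which of those pages is best.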
Google's "PageRank" algorithm in 1998 was a big step forward in automatically cataloging ...