HOW SEARCH ENGINES WORK

The major search engines that account for most market share today are the automated type: Google, Yahoo!, and Microsoft Bing[2] (although Yahoo! now uses Bing’s search index).[3]

Google and Bing, like the smaller search engines in the United States that operate their own technology and search engines elsewhere in the world, share a similar overall infrastructure:

  • Web crawlers crawl the Web. These crawlers follow links to discover the pages on the Web.
  • Extraction processes gather information from those pages (such as textual content, metadata, and links).
  • Indices store the content from web pages. Content is generally stored using word-based keys, similar to the index in a book. When you look up a word in the index of a book, you learn which pages that word appears on. Similarly, with a search engine index, the search engine can look up a word that someone is searching for and find all the web pages associated with that word.
  • Results are scored to determine which pages are the most relevant for each search. When someone does a search (called a query), the search engine checks the index for all the web pages associated with that search and then needs a way to rank those pages in an order that is useful for the searcher. Search engines use a number of factors in scoring, and these factors are adjusted continually based on new algorithms, tests, and other criteria. Search engines keep the details of these scoring ...
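
The four steps above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not how Google or Bing actually work: the in-memory TOY_WEB dictionary, the example URLs, and the function names crawl, extract_words, build_index, and search are all hypothetical, and the scoring (a simple count of matching words) stands in for the many undisclosed factors real engines use.

```python
from collections import defaultdict, deque
import re

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
TOY_WEB = {
    "a.example/home": ("Search engines crawl the web",
                       ["a.example/index", "b.example/rank"]),
    "a.example/index": ("An index maps each word to pages",
                        ["a.example/home"]),
    "b.example/rank": ("Ranking scores pages for each search query search", []),
    "c.example/orphan": ("No links point to this page", []),
}


def crawl(seed):
    """Follow links breadth-first from a seed URL; return URLs in discovery order."""
    seen, queue, order = {seed}, deque([seed]), []
    while queue:
        url = queue.popleft()
        order.append(url)
        for link in TOY_WEB[url][1]:
            if link in TOY_WEB and link not in seen:
                seen.add(link)
                queue.append(link)
    return order


def extract_words(text):
    """Extraction step: lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(urls):
    """Word-keyed index: word -> {url: number of occurrences on that page}."""
    index = defaultdict(dict)
    for url in urls:
        for word in extract_words(TOY_WEB[url][0]):
            index[word][url] = index[word].get(url, 0) + 1
    return index


def search(index, query):
    """Score each page by how often the query words occur; best score first."""
    scores = defaultdict(int)
    for word in extract_words(query):
        for url, count in index.get(word, {}).items():
            scores[url] += count
    # Ties are broken alphabetically by URL so the order is deterministic.
    return sorted(scores, key=lambda u: (-scores[u], u))


crawled = crawl("a.example/home")
index = build_index(crawled)
print(search(index, "search pages"))
```

Note that c.example/orphan never enters the index: no crawled page links to it, so the crawler has no way to discover it, and a query for a word that appears only on that page returns nothing.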
