Chapter 3. THE PROBLEM OF WEB NAVIGATION

We never, ever in the history of mankind have had access to so much information so quickly and so easily.

Vint Cerf, Father of the Internet

Web navigation, also known as surfing, involves browsing web pages and clicking on hyperlinks. Combined with the use of search engines, this activity dominates information seeking. To support surfing, the navigation problem of "getting lost in hyperspace" must be addressed. We argue that machine learning can be used to improve user interaction, and that Markov chains are a natural model of web user navigation.
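To make the Markov chain idea concrete, here is a minimal sketch in Python. It assumes a first-order model, in which the next page depends only on the current page, and estimates transition probabilities from a handful of navigation trails. The page names, the trails, and the function names `estimate_transitions` and `trail_probability` are invented for illustration and do not come from the chapter.

```python
from collections import Counter, defaultdict


def estimate_transitions(trails):
    """Estimate first-order transition probabilities from observed trails.

    Each trail is a list of page identifiers in the order they were visited.
    """
    counts = defaultdict(Counter)
    for trail in trails:
        # Count every observed click from one page to the next.
        for src, dst in zip(trail, trail[1:]):
            counts[src][dst] += 1

    probs = {}
    for src, dsts in counts.items():
        total = sum(dsts.values())
        # Normalize the counts so the outgoing probabilities sum to 1.
        probs[src] = {dst: n / total for dst, n in dsts.items()}
    return probs


def trail_probability(probs, trail):
    """Probability of following the trail's links, given its first page."""
    p = 1.0
    for src, dst in zip(trail, trail[1:]):
        # A link never seen in the data gets probability 0 in this sketch.
        p *= probs.get(src, {}).get(dst, 0.0)
    return p


# Hypothetical trails through a small, made-up site.
trails = [
    ["home", "products", "cart"],
    ["home", "products", "home"],
    ["home", "about"],
    ["home", "products", "cart"],
]

probs = estimate_transitions(trails)
print(probs["home"])
print(trail_probability(probs, ["home", "products", "cart"]))
```

In this toy data, three of the four clicks leaving `home` go to `products`, so that link gets probability 0.75. Multiplying link probabilities along a trail gives the probability of the whole trail, which is one simple way such a model could be used to reason about which links a surfer is likely to follow.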

CHAPTER OBJECTIVES

  • Explain how the navigation problem arises when we surf the web.

  • Motivate the use of machine learning algorithms for developing technologies that can adapt to users' behavior and improve web interaction.

  • Introduce the naive Bayes classifier and its application to automatic classification of web pages.

  • Introduce the notion of trails on the web, and argue that trails should be first-class objects that are supported by the tools we use to interact with the web.

  • Introduce the notion of a Markov chain in the context of the web.

  • Explain how Markov chain probabilities can be used to reason about surfers' navigation behavior.

  • Explain how Markov chain probabilities can be used to measure the relevance of links to surfers.

  • Explain the potential conflict between the objectives of the web site owner and users visiting the site.

  • Explain the potential conflict between the navigability ...
