Chapter 18. SPIDERS
Spiders, also known as web spiders, crawlers, and web walkers, are specialized webbots thatâunlike traditional webbots with well-defined targetsâdownload multiple web pages across multiple websites. As spiders make their way across the Internet, it's difficult to anticipate where they'll go or what they'll find, as they simply follow links they find on previously downloaded pages. Their unpredictability makes spiders fun to write because they act as if they almost have minds of their own.
The best known spiders are those used by the major search engine companies (Google, Yahoo!, and MSN) to identify online content. And while spiders are synonymous with search engines for many people, the potential utility of spiders is ...
Get Webbots, Spiders, and Screen Scrapers now with the O’Reilly learning platform.
O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.