Text Processing with Ruby by Rob Miller

Searching the Document

When we’re scraping a web page, we’re generally interested in only a small part of it. But a document contains so much information that we need some way of telling Nokogiri which particular bit of the page we’re interested in. As humans, we do this visually. We might look at a page and see a table, grasping its contents from the title above it. We’d scan down the rows to find the particular record we’re interested in, and then across the columns to find the particular data value we’re looking for. At no point do we have anything more than a vague appreciation for the structure of the page we’re viewing; it doesn’t matter to us how the document is represented in HTML.

But, predictably, that’s not how a computer ...
