Learning Scrapy by Dimitrios Kouzis-Loukas

Selecting HTML elements with XPath

If you come from a traditional software engineering background and have no knowledge of XPath, you may worry that accessing information in HTML documents requires lots of string matching, searching for tags, and handling special cases, or else parsing the entire tree representation to extract what you want. The good news is that none of this is necessary. You can select and extract elements, attributes, and text with XPath, a language designed specifically for that purpose.

To use XPath in Google Chrome, open the Console tab of Developer Tools and use the $x utility function. For example, you can try $x('//h1') on http://example.com/ ...
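
The same kind of selection can be tried outside the browser. The following is a minimal sketch in Python, assuming a small, well-formed snippet that stands in for a page like http://example.com/ (the markup below is a hypothetical stand-in, not the real page). It uses the standard library's ElementTree, which supports only a limited subset of XPath and requires well-formed markup, so it illustrates the idea rather than serving as a general HTML tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for a simple page (an assumption,
# not the actual content served by http://example.com/).
PAGE = """<html>
  <head><title>Example Domain</title></head>
  <body>
    <div>
      <h1>Example Domain</h1>
      <p>This domain is for use in illustrative examples.</p>
      <p><a href="https://www.iana.org/domains/example">More information...</a></p>
    </div>
  </body>
</html>"""

root = ET.fromstring(PAGE)

# ".//h1" is how ElementTree spells the XPath "//h1" relative to the root:
# it matches every <h1> element at any depth.
h1_texts = [h1.text for h1 in root.findall(".//h1")]

# A predicate such as [@href] keeps only elements that carry that attribute;
# the attribute value is then read with .get().
link_hrefs = [a.get("href") for a in root.findall(".//a[@href]")]

print(h1_texts)    # ['Example Domain']
print(link_hrefs)  # ['https://www.iana.org/domains/example']
```

Note that no string matching or manual tree walking is involved: one expression selects the elements, and the text or attribute is read directly from each match.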
