Reverse engineering a dynamic web page

So far, we have tried to scrape data from a web page the same way as introduced in Chapter 2, Scraping the Data. However, it did not work because the data is loaded dynamically with JavaScript. To scrape this data, we need to understand how the web page loads this data, a process known as reverse engineering. Continuing the example from the preceding section, in Firebug, if we click on the Console tab and then perform a search, we will see that an AJAX request is made, as shown in this screenshot:

This AJAX data is not only accessible from within the search web page, but can also be downloaded directly, as follows: ...

Get Web Scraping with Python now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.

Start your free trial

Web Scraping with Python by Richard Lawson

Reverse engineering a dynamic web page

Don’t leave empty-handed

It’s yours, free.

Check it out now on O’Reilly