Designing the Parsing Script

Our webbot’s objective is to download the target web page, parse the price variables, and place the data into an array for processing. The price-monitoring webbot is largely an exercise in parsing data that appears in tables, since that is how useful online data is usually presented. When tables aren’t used, <div> tags generally take their place and can be parsed in a similar manner.
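
The parse-into-an-array step described above can be sketched roughly as follows. This is a minimal illustration in Python using only the standard library, not the author's script: the names TableRowParser, parse_price_table, and SAMPLE_PAGE are hypothetical, the sample markup is invented, and the download step is left out so that the sketch runs without network access.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collects the text of every <td>/<th> cell, grouped by <tr> row.

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (such as whitespace between tags) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None  # discard any cell left unclosed


def parse_price_table(html):
    """Return the table rows found in html as a list of lists of strings."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Invented stand-in for the downloaded target page (an assumption).
SAMPLE_PAGE = """
<table>
  <tr><th>Product</th><th>Price</th></tr>
  <tr><td>Widget</td><td>$4.50</td></tr>
  <tr><td>Gadget</td><td>$12.00</td></tr>
</table>
"""

rows = parse_price_table(SAMPLE_PAGE)
# Skip the heading row and convert each price string to a number.
prices = [float(row[1].lstrip("$")) for row in rows[1:]]
```

Here rows holds one list per table row and prices holds the numeric values from the second column. A page built from <div> tags could be handled the same way by changing the tag names the parser watches for.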

While we know that the test target for this example won’t change, we don’t know that about targets in the wild. Therefore, we don’t want to be too specific when telling our parsing routines where to look for pricing information. In this example, the parsing script won’t look for data in specific locations; instead, it will look for the desired data relative ...
