Getting ready

StackOverflow makes it fairly easy to scrape data from its pages.  We are going to use content from a job posting at https://stackoverflow.com/jobs/122517/spacex-enterprise-software-engineer-full-stack-spacex?so=p&sec=True&pg=1&offset=22&cl=Amazon%3b+.  The posting will likely no longer be available by the time you read this, so I've included the HTML of the page in the 07/spacex-job-listing.html file, which we will use for the examples in this chapter.
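
Before working with the saved copy, it can help to confirm that the file loads and parses. The following is a minimal sketch, not the recipe's own code: it uses only the Python standard library so that it stays self-contained, and the names read_listing, page_title, and TitleParser are hypothetical helpers introduced here for illustration. It assumes the saved file is UTF-8 encoded.

```python
from html.parser import HTMLParser
from pathlib import Path


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element it encounters."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        # Only start collecting for the first <title> in the document.
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        # Text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self.title += data


def read_listing(path):
    """Return the saved page's HTML as a string (assumes UTF-8)."""
    return Path(path).read_text(encoding="utf-8")


def page_title(html):
    """Return the stripped <title> text, or an empty string if there is none."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.title.strip()
```

Calling page_title(read_listing("07/spacex-job-listing.html")) from the directory that holds the book's sample files should return the title of the saved listing, which is a quick check that the local copy is in place.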

StackOverflow job listing pages are highly structured, probably because they are created by programmers, for programmers.  At the time of writing, the page looks like the following:

A StackOverflow job listing

All of this information is codified within the HTML of the page.  ...
