Until now we’ve danced around the practicalities of scraping, discussing the theory, the tools we’ll be using, and how to approach things. Now let’s try to apply that knowledge and extract some information from a site.
For this example, I’m going to extract the names and point totals of the teams in the Premier League, the top division of English football (that’s soccer in the US). This information is available on a page of the BBC website.
The league table, as visitors to the website see it (as of February 2015), looks like this:
It’s presented in a pretty table, but that’s not much use to us in our scripts. ...
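To make the goal concrete, here is a minimal sketch of what "extracting names and point totals" from an HTML table can look like. It uses only Python's standard library and runs against a small inline sample rather than the live page. The markup, the `team-name` and `points` class names, the placeholder teams, and the `extract_standings` helper are all assumptions for illustration; the real BBC page is almost certainly structured differently.

```python
from html.parser import HTMLParser

# Hypothetical markup: this sample table and its class names are
# assumptions, not the structure of the real BBC page.
SAMPLE_HTML = """
<table>
  <tr><th>Team</th><th>Pts</th></tr>
  <tr><td class="team-name">Team A</td><td class="points">10</td></tr>
  <tr><td class="team-name">Team B</td><td class="points">7</td></tr>
</table>
"""


class LeagueTableParser(HTMLParser):
    """Collects (team name, points) pairs from rows of the sample table."""

    WANTED = ("team-name", "points")

    def __init__(self):
        super().__init__()
        self.rows = []        # finished (name, points) tuples
        self._current = {}    # cells collected for the row being read
        self._field = None    # which wanted cell we are inside, if any

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            # Start each row with a clean slate.
            self._current = {}
        elif tag == "td":
            cls = dict(attrs).get("class")
            if cls in self.WANTED:
                self._field = cls

    def handle_data(self, data):
        # Accumulate text only while inside a cell we care about.
        if self._field is not None:
            self._current[self._field] = self._current.get(self._field, "") + data

    def handle_endtag(self, tag):
        if tag == "td":
            self._field = None
        elif tag == "tr" and len(self._current) == len(self.WANTED):
            # Header rows contain no wanted cells, so they are skipped.
            name = self._current["team-name"].strip()
            points = int(self._current["points"].strip())
            self.rows.append((name, points))


def extract_standings(html):
    """Return a list of (team name, points) tuples found in the HTML."""
    parser = LeagueTableParser()
    parser.feed(html)
    parser.close()
    return parser.rows


if __name__ == "__main__":
    for name, points in extract_standings(SAMPLE_HTML):
        print(name, points)
```

The point of the sketch is the shape of the result: once the table is reduced to a plain list of name and points pairs, a script can sort, filter, or store it, which the rendered table in a browser does not allow.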