How to do it...

This recipe, like most of the others in this chapter, is presented interactively in IPython. The code for each recipe is also available in a script file; the code for this one is in 02/01_parsing_html_wtih_bs.py. You can type the following in, or copy and paste it from the script file.

Now let's walk through parsing HTML with Beautiful Soup. We start with the following code, which retrieves the content of this page with requests.get, creates a BeautifulSoup object from that content, and stores the object in a variable named soup.

In [1]: import requests
   ...: from bs4 import BeautifulSoup
   ...: html = requests.get("http://localhost:8080/planets.html").text
   ...: soup = BeautifulSoup(html, ...
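
The listing above is cut off before the second argument to BeautifulSoup, so the parser the recipe uses is not visible here. As a minimal sketch of the same step, the following assumes Python's built-in "html.parser" and parses a small inline HTML string instead of fetching the page from the local server. The SAMPLE_HTML content and the make_soup helper are hypothetical, introduced only for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the HTML served at planets.html;
# the real page content is not shown in this section.
SAMPLE_HTML = """
<html>
  <head><title>Planets</title></head>
  <body>
    <table id="planets">
      <tr><th>Name</th></tr>
      <tr><td>Mercury</td></tr>
      <tr><td>Venus</td></tr>
    </table>
  </body>
</html>
"""


def make_soup(html, parser="html.parser"):
    """Build a BeautifulSoup object from an HTML string.

    The "html.parser" default is an assumption; the recipe's own
    parser argument is truncated in the listing.
    """
    return BeautifulSoup(html, parser)


soup = make_soup(SAMPLE_HTML)

# Once parsed, the document can be navigated as a tree of objects.
title_text = soup.title.string
planet_names = [td.get_text() for td in soup.find_all("td")]

print(title_text)
print(planet_names)
```

To run this against the recipe's page instead, the HTML string could come from requests.get(...).text as in the listing, provided the local web server is running.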
