There's more...

It is also possible to process infinitely scrolling pages using Selenium. The following code is in 06/06_scrape_continuous_twitter.py:

from selenium import webdriver
import time

driver = webdriver.PhantomJS()
print("Starting")
driver.get("https://twitter.com")

scroll_pause_time = 1.5

# Get scroll height
last_height = driver.execute_script("return document.body.scrollHeight")
while True:
    print(last_height)
    # Scroll down to bottom
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    # Wait to load page
    time.sleep(scroll_pause_time)
    # Calculate new scroll height and compare with last scroll height
    new_height = driver.execute_script("return document.body.scrollHeight")
    print(new_height, last_height)
    if new_height
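The listing is cut off at the height comparison, so the rest of the loop is not shown here. One common way to finish this kind of loop, offered as a minimal sketch and not as the book's own code, is to stop once the page height no longer grows after a scroll. The function name scroll_to_bottom and the max_rounds safety cap are assumptions introduced for illustration; the two JavaScript snippets are the ones from the listing.

```python
import time


def scroll_to_bottom(driver, pause=1.5, max_rounds=50):
    """Scroll until the page height stops growing and return the final height.

    `driver` is any object exposing a Selenium-style execute_script(script).
    `max_rounds` is a hypothetical safety cap (not in the original listing)
    so that a truly endless feed cannot keep the loop running forever.
    """
    # Height of the document before any scrolling
    last_height = driver.execute_script("return document.body.scrollHeight")
    for _ in range(max_rounds):
        # Scroll down to the bottom of the page
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        # Give the page time to load more content
        time.sleep(pause)
        # Compare the new height with the previous one
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            # Nothing new was loaded, so assume the bottom has been reached
            break
        last_height = new_height
    return last_height
```

With a real browser, this could be called as scroll_to_bottom(driver) after driver.get(...). Because the function only relies on execute_script, it can also be exercised with a stand-in object, which makes the stopping condition easy to check without launching a browser.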
