Working with Images, Audio, and Other Assets

In this chapter, we will cover:

  • Downloading media content on the web
  • Parsing a URL with urllib to get the filename
  • Determining type of content for a URL
  • Determining a file extension from a content type
  • Downloading and saving images to the local file system
  • Downloading and saving images to S3
  • Generating thumbnails for images
  • Taking website screenshots with Selenium
  • Taking a website screenshot with an external service
  • Performing OCR on images with pytesseract
  • Creating a video thumbnail
  • Ripping an MP4 video to an MP3
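
As a rough preview of two of the smaller recipes listed above (getting a filename from a URL and mapping a content type to a file extension), here is a minimal sketch that uses only the Python standard library. The function names and the example URL are hypothetical and chosen for illustration; the chapter's own recipes may be structured differently.

```python
import mimetypes
import posixpath
from urllib.parse import unquote, urlparse


def filename_from_url(url):
    """Return the last path segment of a URL, ignoring query and fragment."""
    # urlparse separates the path from the query string and fragment.
    path = urlparse(url).path
    # URL paths always use forward slashes, so posixpath is used here;
    # unquote decodes percent-escapes such as %20.
    return unquote(posixpath.basename(path))


def extension_from_content_type(content_type):
    """Guess a file extension from a Content-Type header value."""
    # Drop parameters such as "; charset=utf-8" before the lookup.
    mime = content_type.split(";", 1)[0].strip().lower()
    # Returns None when the MIME type is not known to the mimetypes module.
    return mimetypes.guess_extension(mime)


if __name__ == "__main__":
    # Hypothetical URL, used only for illustration.
    print(filename_from_url("https://example.com/images/photo%201.png?size=large"))
    print(extension_from_content_type("image/png; charset=binary"))
```

The extension lookup relies on the MIME table that `mimetypes` builds at run time, so the result for some types can vary between Python versions and platforms. In practice the content type would come from the `Content-Type` header of an HTTP response.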
