How it works

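The boto3 library wraps the AWS S3 API in a Pythonic interface. The .client() call returns an object that we use to communicate with S3; boto3 signs each request made through it with your AWS credentials. Make sure your access keys are available, for example in the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables, as otherwise the calls will fail.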
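As a minimal sketch of the credential check, the snippet below looks for the two standard environment variables before creating the client. The helper names missing_aws_env_vars() and make_s3_client() are hypothetical and not part of the recipe, and the check assumes environment variables are the only credential source in use (boto3 can also read credentials from other places, such as a shared credentials file).

```python
import os

# Standard environment variable names that boto3 reads credentials from.
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_aws_env_vars(environ=None):
    """Return the names of required AWS variables that are unset or empty."""
    if environ is None:
        environ = os.environ
    return [name for name in REQUIRED_VARS if not environ.get(name)]


def make_s3_client():
    """Create an S3 client, failing early if the keys are not in the environment.

    Hypothetical helper: assumes credentials are supplied only through
    environment variables.
    """
    missing = missing_aws_env_vars()
    if missing:
        raise RuntimeError("Missing AWS credentials: " + ", ".join(missing))
    import boto3  # third-party; imported here so the check above needs no install
    return boto3.client("s3")
```

Calling make_s3_client() at the start of the script reports a missing key immediately instead of failing on the first S3 request.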

The bucket name must be globally unique. At the time of writing, this bucket name is available, but you will likely need to change it. The .create_bucket() call creates the bucket and sets its ACL. The put_object() call uploads the scraped data into an object in the bucket in a single request; it does not go through boto3's managed transfer layer, which is what upload_file() and upload_fileobj() use.
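The two calls can be sketched roughly as follows. This is an illustration under stated assumptions rather than the recipe's exact code: the function name store_scraped_data(), the "private" ACL, the JSON serialization, and the optional region argument are all assumptions. The s3 argument is expected to be an object such as the one returned by boto3.client("s3").

```python
import json


def store_scraped_data(s3, bucket, key, records, region=None):
    """Create a bucket and upload records as a JSON object (hypothetical helper).

    Returns the number of bytes uploaded.
    """
    # Assumption: a private ACL; the original recipe may use a different one.
    create_args = {"Bucket": bucket, "ACL": "private"}
    # Outside us-east-1, S3 requires an explicit location constraint.
    if region and region != "us-east-1":
        create_args["CreateBucketConfiguration"] = {"LocationConstraint": region}
    s3.create_bucket(**create_args)

    # put_object sends the whole body in one PutObject request.
    body = json.dumps(records).encode("utf-8")
    s3.put_object(
        Bucket=bucket,
        Key=key,
        Body=body,
        ContentType="application/json",
    )
    return len(body)
```

Because the client is passed in, the function can be exercised with a stand-in object during testing and with a real client in production.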
