Loading the data

I have always liked The Adventures of Sherlock Holmes by Sir Arthur Conan Doyle. Let's set the URL to download the book from and the local file name to save it under:

url = 'http://www.gutenberg.org/ebooks/1661.txt.utf-8'
file_name = 'sherlock.txt'

Let's actually download the file. You only need to do this once, but this download utility can be used whenever you are downloading other datasets, too:

import urllib.request

# Download the file from `url` and save it locally under `file_name`:
with urllib.request.urlopen(url) as response:
    with open(file_name, 'wb') as out_file:
        data = response.read()  # a `bytes` object
        out_file.write(data)
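Since the download only needs to happen once, the same logic can be wrapped in a small helper that skips the request when the file is already on disk. This is a minimal sketch, not part of the original text: the function name `download_once` and its return value are hypothetical choices made for illustration. It streams the response to disk with `shutil.copyfileobj` instead of reading it all into memory first.

```python
import os
import shutil
import urllib.request


def download_once(url, file_name):
    """Download `url` to `file_name` unless that file already exists.

    Hypothetical helper for illustration. Returns True if a download
    happened, and False if the file was already present.
    """
    if os.path.exists(file_name):
        return False  # nothing to do: reuse the local copy
    with urllib.request.urlopen(url) as response:
        with open(file_name, 'wb') as out_file:
            # Copy the response body to disk in chunks.
            shutil.copyfileobj(response, out_file)
    return True


# Example usage (commented out so that nothing is fetched on import):
# download_once('http://www.gutenberg.org/ebooks/1661.txt.utf-8', 'sherlock.txt')
```

Re-running a notebook cell that calls this helper will not fetch the book again. Note that this sketch does not clean up a partially written file if the download fails midway; in that case, delete the file and retry.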

Moving on, let's check whether we got the correct file in place with shell syntax inside our Jupyter notebook. This ability to run ...
