Creating the training dataset

Let's now create the training set for the chatbot. We'd need all the conversations between the characters in the correct order: fortunately, the corpora contains more than what we actually need. For creating the dataset, we will start by downloading the zip archive, if it's not already on disk. We'll then decompress the archive in a temporary folder (if you're using Windows, that should be C:\Temp), and we will read just the movie_lines.txt and the movie_conversations.txt files, the ones we really need to create a dataset of consecutive utterances.

Let's now go step by step, creating multiple functions, one for each step, in the file corpora_downloader.py. The first function we need is to retrieve the file from ...

Get TensorFlow Deep Learning Projects now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.