Data preprocessing and data analysis

In this section, we will cover data preprocessing and data analysis. The main preprocessing task is to prepare our training dataset. You may be wondering what kind of preparation is needed, considering we already have the data. The answer is that we have two independent datasets: the DJIA dataset and the NYTimes news article dataset. We need to merge them in order to get meaningful insights from the data. Once we have prepared our training dataset, we can train models on it using different machine learning (ML) algorithms.

Now let's start coding to prepare the training dataset. We will be using numpy, csv, json, and pandas as our dependency libraries. ...
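To make the idea of merging the two datasets concrete, here is a minimal sketch using pandas. It is not the book's actual code: the column names (`Date`, `Close`, `pub_date`, `headline`), the helper `build_training_frame`, and the toy values are all assumptions made for illustration. The sketch joins daily index prices with news articles by calendar date, concatenating all headlines published on the same day.

```python
import pandas as pd

# Toy stand-ins for the two datasets. All values and column names are
# made up for illustration; the real files may be laid out differently.
djia = pd.DataFrame({
    "Date": ["2016-01-04", "2016-01-05", "2016-01-06"],
    "Close": [100.0, 101.5, 99.8],
})
news = pd.DataFrame({
    "pub_date": [
        "2016-01-04T09:30:00",
        "2016-01-04T15:00:00",
        "2016-01-06T08:00:00",
    ],
    "headline": ["Headline A", "Headline B", "Headline C"],
})


def build_training_frame(prices, articles):
    """Merge daily prices with news headlines on the calendar date.

    Hypothetical helper: assumes `prices` has a `Date` column and
    `articles` has `pub_date` and `headline` columns.
    """
    prices = prices.copy()
    articles = articles.copy()

    # Put both date columns on a common, time-free datetime key.
    prices["Date"] = pd.to_datetime(prices["Date"])
    articles["Date"] = pd.to_datetime(articles["pub_date"]).dt.normalize()

    # Collapse all headlines from one day into a single string.
    daily_news = (
        articles.groupby("Date")["headline"].agg(" ".join).reset_index()
    )

    # A left join keeps every trading day, even those without news.
    merged = prices.merge(daily_news, on="Date", how="left")
    merged["headline"] = merged["headline"].fillna("")
    return merged


training = build_training_frame(djia, news)
print(training)
```

A left join is used here so that no trading day is dropped. Whether days without news should be kept or discarded is a design choice that depends on the model you plan to train.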
