Starting with basic feature engineering

Before we start coding, we have to load the dataset into Python and make sure that all the packages the project needs are available. The following packages must be installed on our system (the latest versions should suffice; no specific version is required):

  • NumPy
  • pandas
  • fuzzywuzzy
  • python-Levenshtein
  • scikit-learn
  • gensim
  • pyemd
  • NLTK

As we use each of these packages in the project, we will provide specific instructions and tips for installing it.
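As a quick sanity check before going further, the following minimal sketch reports which of the listed packages cannot be found in the current environment. It relies only on the standard library; the `REQUIRED` mapping and the `find_missing` helper are illustrative names introduced here, and the import names (for example, `sklearn` for scikit-learn and `Levenshtein` for python-Levenshtein) are the ones these packages usually expose, so adjust them if your setup differs.

```python
import importlib.util

# Illustrative mapping (not from the original text): pip package name -> the
# module name normally used in an `import` statement.
REQUIRED = {
    "numpy": "numpy",
    "pandas": "pandas",
    "fuzzywuzzy": "fuzzywuzzy",
    "python-Levenshtein": "Levenshtein",
    "scikit-learn": "sklearn",
    "gensim": "gensim",
    "pyemd": "pyemd",
    "nltk": "nltk",
}


def find_missing(required=REQUIRED):
    """Return the pip names of packages whose modules cannot be located."""
    missing = []
    for pip_name, module_name in required.items():
        # find_spec returns None when a top-level module is not installed;
        # it does not import the module itself.
        if importlib.util.find_spec(module_name) is None:
            missing.append(pip_name)
    return missing


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

Any name that the script reports as missing can then be installed with pip under that same name.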

For all dataset operations, we will be using pandas (and NumPy will come in handy, too). To install NumPy and pandas:

pip install numpy
pip install pandas

The dataset can be loaded into memory easily by using pandas and a specialized ...
