Chapter 6. Analyzing Text Data

In this chapter, we will cover the following recipes:

  • Preprocessing data using tokenization
  • Stemming text data
  • Converting text to its base form using lemmatization
  • Dividing text using chunking
  • Building a bag-of-words model
  • Building a text classifier
  • Identifying the gender
  • Analyzing the sentiment of a sentence
  • Identifying patterns in text using topic modeling

Introduction

Text analysis and natural language processing (NLP) are integral parts of modern artificial intelligence systems. Computers are good at understanding rigidly structured data with limited variety. However, when we deal with unstructured free-form text, things begin to get difficult. Developing NLP applications is challenging because computers have a hard time ...
