Machine learning for text

The community has at least 10 to 20 well-known machine learning techniques, ranging from SVMs to several kinds of regression and gradient boosting machines. We will sample a small selection of these.

Source: https://www.kaggle.com/surveys/2017.

The preceding graph shows the most popular machine learning techniques used by Kagglers.

We met logistic regression in the first chapter while working with the 20 newsgroups dataset. We will revisit logistic regression and introduce Naive Bayes, SVM, decision trees, random forests, and XGBoost. XGBoost is a popular algorithm used by several Kaggle winners to achieve award-winning ...
