Optimizing our classifiers

Let's now focus on our best performing model, logistic regression, and see if we can push its performance a little higher. The best performance for our LR-based model is an accuracy of 0.88312, as seen earlier.

We are using the phrases parameter search and hyperparameter search interchangeably here. This is done to stay consistent with deep learning vocabulary.

We want to select the best performing configuration of our pipeline. Each configuration might be different in small ways, such as when we remove stop words, bigrams, and trigrams, or similar processes. The total number of such configurations can be fairly large, and can sometimes run into the thousands. In addition to manually selecting a few combinations ...

Get Natural Language Processing with Python Quick Start Guide now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.