Python 3 Text Processing with NLTK 3 Cookbook by Jacob Perkins

Training a classifier with NLTK-Trainer

In this recipe, we'll cover the train_classifier.py script from NLTK-Trainer, which lets you train NLTK classifiers from the command line. NLTK-Trainer was previously introduced at the end of Chapter 4, Part-of-speech Tagging, and again at the end of Chapter 5, Extracting Chunks.

Note

You can find NLTK-Trainer at https://github.com/japerk/nltk-trainer and the online documentation at http://nltk-trainer.readthedocs.org/.

How to do it...

Like train_tagger.py and train_chunker.py, train_classifier.py requires only one argument: the name of a corpus. The corpus must have a categories() method, because text classification is all about learning to assign categories to text. Here's an example of running train_classifier.py ...
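
The categories() requirement can be illustrated with a small duck-typing check. This is only a minimal sketch and is not taken from NLTK-Trainer itself: the helper supports_classification and the two stand-in corpus classes are hypothetical names invented for illustration, and the script's own validation may work differently.

```python
def supports_classification(corpus):
    """Return True if the corpus object exposes a callable categories() method."""
    return callable(getattr(corpus, "categories", None))


class TinyCategorizedCorpus:
    """Hypothetical stand-in for a categorized corpus reader (not an NLTK class)."""

    def __init__(self, files_by_category):
        # Maps a category label to the list of file IDs filed under it.
        self._files_by_category = files_by_category

    def categories(self):
        # Return the category labels in sorted order.
        return sorted(self._files_by_category)


class TinyPlainCorpus:
    """Hypothetical stand-in for a corpus that carries no category labels."""

    def fileids(self):
        return ["a.txt"]


categorized = TinyCategorizedCorpus({"pos": ["p1.txt"], "neg": ["n1.txt"]})
plain = TinyPlainCorpus()

print(supports_classification(categorized))  # categorized corpus passes the check
print(supports_classification(plain))        # plain corpus does not
```

With NLTK installed and the corpus data downloaded, the same check could be applied to a real categorized corpus such as nltk.corpus.movie_reviews, which provides a categories() method.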
