Training a classifier with NLTK-Trainer

In this recipe, we'll cover the train_classifier.py script from NLTK-Trainer, which lets you train NLTK classifiers from the command line. NLTK-Trainer was introduced at the end of Chapter 4, Part-of-speech Tagging, and used again at the end of Chapter 5, Extracting Chunks.

Note

You can find NLTK-Trainer at https://github.com/japerk/nltk-trainer and the online documentation at http://nltk-trainer.readthedocs.org/.

How to do it...

Like train_tagger.py and train_chunker.py, the only required argument for train_classifier.py is the name of a corpus. The corpus must have a categories() method, because text classification means learning to assign categories to text. Here's an example of running train_classifier.py ...
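Separately from the script invocation itself, the categories() requirement can be illustrated with a small duck-typing check. This is a minimal sketch and not part of NLTK-Trainer: the has_categories helper and the two stub classes are hypothetical names, used here only as stand-ins for real corpus readers so the example runs without any corpus data installed.

```python
def has_categories(corpus):
    """Return True if the corpus object exposes a callable categories() method."""
    return callable(getattr(corpus, "categories", None))


class CategorizedStub:
    """Hypothetical stand-in for a categorized corpus reader."""

    def categories(self):
        return ["neg", "pos"]


class PlainStub:
    """Hypothetical stand-in for a corpus reader without categories."""

    def words(self):
        return ["some", "text"]


print(has_categories(CategorizedStub()))  # True
print(has_categories(PlainStub()))        # False
```

A corpus that fails a check like this has no labels to learn from, so it would not be a suitable input for training a classifier.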
