Unlike most part-of-speech taggers, the
ClassifierBasedTagger class learns from features. That means we can create a
ClassifierChunker class that can learn from both the words and part-of-speech tags, instead of only the part-of-speech tags as the
TagChunker class does.
For the ClassifierChunker class, we don't want to discard the words from the training sentences as we did in the previous recipe. Instead, to remain compatible with the 2-tuple
(word, pos) format required for training a
ClassifierBasedTagger class, we convert the
(word, pos, iob) 3-tuples into
((word, pos), iob) 2-tuples using the
chunk_trees2train_chunks() function. This code can be found in
from nltk.chunk ...
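The regrouping step described above can be illustrated with a minimal sketch. It assumes the training sentences are already lists of (word, pos, iob) 3-tuples; the function name regroup_iob_triples and the sample sentence are hypothetical, and this is not the recipe's chunk_trees2train_chunks() function, which starts from chunk trees rather than from tuples.

```python
def regroup_iob_triples(tagged_sents):
    """Turn sentences of (word, pos, iob) into sentences of ((word, pos), iob).

    Hypothetical helper for illustration: it keeps the word and the
    part-of-speech tag together as the "token" and uses the IOB tag as the
    label, which matches the 2-tuple shape a tagger is trained on.
    """
    return [
        [((word, pos), iob) for (word, pos, iob) in sent]
        for sent in tagged_sents
    ]


# Hand-written sample sentence, for illustration only.
sample = [[("the", "DT", "B-NP"), ("cat", "NN", "I-NP"), ("sat", "VBD", "O")]]
train_chunks = regroup_iob_triples(sample)
print(train_chunks[0][0])  # (('the', 'DT'), 'B-NP')
```

Because each (word, pos) pair is kept intact as the token, a feature-based tagger can use both the word and its part-of-speech tag when it learns to predict the IOB label.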