A unigram generally refers to a single token. A unigram tagger therefore uses only a single word as its context for determining the part-of-speech tag.
UnigramTagger inherits from NgramTagger, which is a subclass of ContextTagger, which inherits from SequentialBackoffTagger. In other words, UnigramTagger is a context-based tagger whose context is a single word, or unigram.
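To make the idea of "context is a single word" concrete, here is a minimal pure-Python sketch of unigram tagging. It is not NLTK's implementation: the helper names train_unigram_model and tag_sentence are hypothetical, and the toy sentences are invented for illustration. The sketch simply records, for each word, the tag it carries most often in the training data, and assigns None to words it has never seen.

```python
from collections import Counter, defaultdict


def train_unigram_model(tagged_sents):
    """Map each word to the tag it carries most often in the training data."""
    counts = defaultdict(Counter)
    for sent in tagged_sents:
        for word, tag in sent:
            counts[word][tag] += 1
    # Keep only the single most frequent tag per word.
    return {word: tag_counts.most_common(1)[0][0]
            for word, tag_counts in counts.items()}


def tag_sentence(model, words):
    """Tag each word using only the word itself as context.

    Words missing from the model are tagged None (an assumption of this sketch).
    """
    return [(word, model.get(word)) for word in words]


# Toy training data, invented for illustration only.
toy_sents = [
    [("the", "DT"), ("dog", "NN"), ("runs", "VBZ")],
    [("the", "DT"), ("runs", "NNS"), ("ended", "VBD")],
    [("she", "PRP"), ("runs", "VBZ")],
]

model = train_unigram_model(toy_sents)
print(tag_sentence(model, ["the", "dog", "runs", "fast"]))
# prints [('the', 'DT'), ('dog', 'NN'), ('runs', 'VBZ'), ('fast', None)]
```

Because the lookup ignores neighboring words, "runs" always receives its most frequent tag (VBZ in this toy data), even in sentences where it is used as a noun.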
UnigramTagger can be trained by giving it a list of tagged sentences at initialization.
>>> from nltk.tag import UnigramTagger
>>> from nltk.corpus import treebank
>>> train_sents = treebank.tagged_sents()[:3000]
>>> tagger = UnigramTagger(train_sents)
>>> treebank.sents()[0]
['Pierre', 'Vinken', ',', '61', 'years', 'old', ...