Natural Language Processing with Java and LingPipe Cookbook by Krishna Dayanidhi, Breck Baldwin

Conditional random fields (CRF) for word/token tagging

Conditional random fields (CRFs) extend the Logistic regression recipe in Chapter 3, Advanced Classifiers, to word tagging. At the end of Chapter 1, Simple Classifiers, we discussed various ways to encode a problem as a classification problem. CRFs treat sequence tagging as finding the best category, where each category is one complete assignment of tags to tokens. With T tags and N tokens, there are T^N such assignments.

For example, if we have the tokens The and rain and the tags d for determiner and n for noun, then the set of categories for the CRF classifier is:

  • Category 1: d d
  • Category 2: n d
  • Category 3: n n
  • Category 4: d n
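
To make the size of this category set concrete, here is a minimal sketch in plain Java that enumerates every tag assignment for a token sequence. It does not use LingPipe, and the class name TagSequenceEnumerator and its enumerate method are hypothetical names chosen for illustration. It assumes Java 9 or later for List.of.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of LingPipe.
public class TagSequenceEnumerator {

    // Returns every assignment of one tag per token:
    // tags.size() raised to the power numTokens sequences in total.
    public static List<List<String>> enumerate(List<String> tags, int numTokens) {
        List<List<String>> sequences = new ArrayList<>();
        sequences.add(new ArrayList<>()); // start from the single empty sequence
        for (int i = 0; i < numTokens; i++) {
            List<List<String>> extended = new ArrayList<>();
            for (List<String> prefix : sequences) {
                for (String tag : tags) {
                    // Extend each partial sequence with each possible tag.
                    List<String> next = new ArrayList<>(prefix);
                    next.add(tag);
                    extended.add(next);
                }
            }
            sequences = extended;
        }
        return sequences;
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("The", "rain");
        List<String> tags = List.of("d", "n");
        List<List<String>> categories = enumerate(tags, tokens.size());
        for (int i = 0; i < categories.size(); i++) {
            System.out.println("Category " + (i + 1) + ": "
                + String.join(" ", categories.get(i)));
        }
    }
}
```

For two tokens and two tags this yields four categories, though not necessarily in the order listed above. The count multiplies by the number of tags with every added token, which is why a brute-force enumeration like this one is only workable for very short inputs.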

Various optimizations are applied to keep this combinatoric nightmare computable, ...
