Chapter 5. Extracting Chunks

In this chapter, we will cover the following recipes:

  • Chunking and chinking with regular expressions
  • Merging and splitting chunks with regular expressions
  • Expanding and removing chunks with regular expressions
  • Partial parsing with regular expressions
  • Training a tagger-based chunker
  • Classification-based chunking
  • Extracting named entities
  • Extracting proper noun chunks
  • Extracting location chunks
  • Training a named entity chunker
  • Training a chunker with NLTK-Trainer

Introduction

Chunk extraction, or partial parsing, is the process of extracting short phrases from a part-of-speech tagged sentence. This is different from full parsing in that we're interested in standalone chunks, or phrases, instead of full parse trees (for more on parse ...
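To make the idea concrete, here is a minimal sketch of pulling noun-phrase chunks out of a part-of-speech tagged sentence. It uses only Python's standard library and is not NLTK's implementation: the tagged sentence is hand-written, and the names `NP_PATTERN` and `extract_chunks` are hypothetical, chosen for illustration. The assumption is that a noun phrase is an optional determiner, any number of adjectives, and one or more nouns, using Penn Treebank-style tags.

```python
import re

# Hand-tagged example sentence (Penn Treebank-style tags); illustrative data only.
tagged = [("the", "DT"), ("little", "JJ"), ("dog", "NN"),
          ("chased", "VBD"), ("a", "DT"), ("ball", "NN")]

# Hypothetical noun-phrase pattern over tags:
# optional determiner, any adjectives, one or more nouns.
NP_PATTERN = re.compile(r"(?:<DT>)?(?:<JJ>)*(?:<NN[A-Z]*>)+")


def extract_chunks(tagged_sentence, pattern=NP_PATTERN):
    """Return a list of chunks, each a list of words whose tags match the pattern."""
    # Encode the tags as one string such as "<DT><JJ><NN>" and remember
    # the character offset at which each token's tag begins.
    starts = []
    parts = []
    offset = 0
    for _, tag in tagged_sentence:
        starts.append(offset)
        part = "<%s>" % tag
        parts.append(part)
        offset += len(part)
    tag_string = "".join(parts)

    chunks = []
    for match in pattern.finditer(tag_string):
        # Map character offsets of the match back to token indices.
        first = starts.index(match.start())
        last = sum(1 for s in starts if s < match.end())
        chunks.append([word for word, _ in tagged_sentence[first:last]])
    return chunks


print(extract_chunks(tagged))
```

Note that the result is a flat list of standalone phrases rather than a tree covering the whole sentence; the verb "chased" belongs to no chunk and is simply left out.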
