Python Text Processing with NLTK 2.0 Cookbook by Jacob Perkins

Merging and splitting chunks with regular expressions

In this recipe, we will cover two more rules for chunking. A MergeRule merges two adjacent chunks based on the end of the first chunk and the beginning of the second chunk. A SplitRule splits a chunk in two at the specified split pattern.
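To make the two rules concrete before looking at the grammar syntax, here is a minimal sketch that builds them directly from the rule classes in nltk.chunk.regexp. It assumes a recent NLTK 3.x install; the four-word tagged input, the helper chunk_words, and the rule descriptions are made up for illustration and are not taken from the recipe.

```python
from nltk.chunk.regexp import ChunkRule, MergeRule, SplitRule, RegexpChunkParser
from nltk.tree import Tree

# Hypothetical tagged input, chosen only to keep the example small.
tagged = [("the", "DT"), ("cat", "NN"), ("the", "DT"), ("dog", "NN")]


def chunk_words(tree):
    """Return the words of each top-level chunk subtree, in order."""
    return [[word for word, tag in child] for child in tree if isinstance(child, Tree)]


# Start from one big chunk, then split it wherever any tag is
# immediately followed by a determiner.
split_parser = RegexpChunkParser([
    ChunkRule("<.*>+", "chunk everything"),
    SplitRule("<.*>", "<DT>", "split before a determiner"),
])
split_chunks = chunk_words(split_parser.parse(tagged))

# Start from two determiner + noun chunks, then merge them because the
# first ends with a noun and the second begins with a determiner.
merge_parser = RegexpChunkParser([
    ChunkRule("<DT><NN>", "chunk determiner + noun"),
    MergeRule("<NN>", "<DT>", "merge noun-final with determiner-initial chunk"),
])
merge_chunks = chunk_words(merge_parser.parse(tagged))

print(split_chunks)  # [['the', 'cat'], ['the', 'dog']]
print(merge_chunks)  # [['the', 'cat', 'the', 'dog']]
```

Both rules only rearrange chunk boundaries that already exist, which is why each parser begins with an ordinary ChunkRule.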

How to do it...

A SplitRule is specified with two opposing curly braces, }{, surrounded by a pattern on either side. To split a chunk after a noun, you would write <NN.*>}{<.*>. A MergeRule is specified by flipping the curly braces to {}, and will join chunks where the end of the first chunk matches the left pattern and the beginning of the next chunk matches the right pattern. To merge two chunks where the first ends with a noun and the second begins ...
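The syntax described above can be tried out with a RegexpParser grammar. The following is a sketch rather than the recipe's own listing: the split rule <NN.*>}{<.*> comes from the text, while the initial chunk rule, the second split rule, the merge rule <NN.*>{}<NN.*>, and the sample sentence are assumptions chosen for illustration. It assumes a recent NLTK 3.x install.

```python
from nltk.chunk import RegexpParser
from nltk.tree import Tree

# Rules are applied in order:
#   1. chunk from a determiner through to the last noun (greedy)
#   2. split after every noun (pattern taken from the text)
#   3. split before every determiner (assumed, for illustration)
#   4. merge a chunk ending in a noun with one starting with a noun (assumed)
chunker = RegexpParser(r'''
NP:
    {<DT><.*>*<NN.*>}
    <NN.*>}{<.*>
    <.*>}{<DT>
    <NN.*>{}<NN.*>
''')

# Hypothetical pre-tagged sentence; no tagger or corpus download is needed.
sent = [("the", "DT"), ("sushi", "NN"), ("roll", "NN"), ("was", "VBD"),
        ("filled", "VBN"), ("with", "IN"), ("the", "DT"), ("fish", "NN")]

tree = chunker.parse(sent)

# Collect the words of each top-level chunk for easy inspection.
chunks = [[word for word, tag in child] for child in tree if isinstance(child, Tree)]
print(chunks)
# [['the', 'sushi', 'roll'], ['was', 'filled', 'with'], ['the', 'fish']]
```

The first rule chunks the whole sentence, the two split rules break it into four pieces, and the merge rule then rejoins "the sushi" with "roll" because one chunk ends with a noun and the next begins with one.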