A chunk is a short phrase within a sentence. If you remember sentence diagrams from grade school, they were tree-like representations of the phrases within a sentence. Chunks are exactly that: sub-trees within a sentence tree. They are covered in much more detail in Chapter 5, Extracting Chunks. The following is a sample sentence tree with three noun phrase (NP) chunks shown as sub-trees.
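To make the idea of chunks as sub-trees concrete, here is a minimal sketch using NLTK's `Tree` class. The sentence and its part-of-speech tags are hypothetical, chosen only for illustration, and are not taken from the treebank corpus. The sketch assumes NLTK 3, where a tree's label is read with `label()` (NLTK 2 exposed it as the `node` attribute instead).

```python
from nltk.tree import Tree

# Hypothetical example sentence; each leaf is a (word, POS tag) pair.
# The three NP sub-trees are the noun phrase chunks.
sentence = Tree('S', [
    Tree('NP', [('the', 'DT'), ('book', 'NN')]),
    ('has', 'VBZ'),
    Tree('NP', [('many', 'JJ'), ('chapters', 'NNS')]),
    ('about', 'IN'),
    Tree('NP', [('chunks', 'NNS')]),
])

# subtrees() yields the sentence tree itself and every nested tree,
# so filter on the label to keep only the NP chunks.
np_chunks = [t for t in sentence.subtrees() if t.label() == 'NP']

# Join the words of each chunk back into a phrase, dropping the tags.
chunk_phrases = [' '.join(word for word, tag in chunk.leaves())
                 for chunk in np_chunks]

print(chunk_phrases)  # ['the book', 'many chapters', 'chunks']
```

Words that sit outside any chunk, such as `has` and `about` here, stay as direct children of the sentence tree, which is why only the three noun phrases come back as sub-trees.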
This recipe will cover how to create a corpus with sentences that contain chunks.
Here is an excerpt from the tagged treebank corpus. It has part-of-speech tags, as in the previous recipe, but it also has square ...