A chunk is a short phrase within a sentence. If you remember sentence diagrams from grade school, they were tree-like representations of the phrases in a sentence. That is exactly what chunks are: subtrees within a sentence tree. They will be covered in much more detail in Chapter 5, Extracting Chunks. The following is a sample sentence tree with three Noun Phrase (NP) chunks shown as subtrees:
This recipe will cover how to create a corpus with sentences that contain chunks.
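To make the idea of chunks as subtrees concrete, here is a minimal plain-Python sketch. It does not use NLTK's own tree class; the nested-tuple layout, the helper names (`is_chunk`, `chunks`, `flatten`), and the example sentence with its tags are all assumptions made up for illustration.

```python
# A chunked sentence as a shallow tree: the root holds either plain
# (word, tag) tokens or (label, [tokens]) chunk subtrees.
# The sentence and its tags are hypothetical, not taken from a corpus.
sentence_tree = ("S", [
    ("NP", [("the", "DT"), ("quick", "JJ"), ("fox", "NN")]),
    ("jumped", "VBD"),
    ("over", "IN"),
    ("NP", [("the", "DT"), ("lazy", "JJ"), ("dog", "NN")]),
    ("near", "IN"),
    ("NP", [("the", "DT"), ("barn", "NN")]),
])


def is_chunk(node):
    """A chunk is a (label, list) pair; a token is a (word, tag) pair of strings."""
    return isinstance(node[1], list)


def chunks(tree, label="NP"):
    """Return the word lists of every chunk subtree with the given label."""
    _, children = tree
    return [
        [word for word, _ in child[1]]
        for child in children
        if is_chunk(child) and child[0] == label
    ]


def flatten(tree):
    """Return the (word, tag) tokens in order, discarding chunk structure."""
    _, children = tree
    tokens = []
    for child in children:
        if is_chunk(child):
            tokens.extend(child[1])  # unpack the chunk's tokens
        else:
            tokens.append(child)     # plain token outside any chunk
    return tokens


noun_phrases = chunks(sentence_tree)
tagged_words = flatten(sentence_tree)
```

Here `noun_phrases` holds the three NP chunks as word lists, while `tagged_words` recovers the flat part-of-speech tagged sentence. The point of the sketch is that a chunked sentence carries strictly more structure than a tagged one: the tagged form can always be derived from the tree, but not the other way around.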
The following is an excerpt from the tagged treebank corpus. It has part-of-speech tags, as in the previous recipe, but it also ...