O'Reilly logo

Natural Language Processing with Python by Edward Loper, Steven Bird, Ewan Klein

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Mapping Words to Properties Using Python Dictionaries

As we have seen, a tagged word of the form (word, tag) is an association between a word and a part-of-speech tag. Once we start doing part-of-speech tagging, we will be creating programs that assign a tag to a word, the tag which is most likely in a given context. We can think of this process as mapping from words to tags. The most natural way to store mappings in Python uses the so-called dictionary data type (also known as an associative array or hash array in other programming languages). In this section, we look at dictionaries and see how they can represent a variety of language information, including parts-of-speech.

Indexing Lists Versus Dictionaries

A text, as we have seen, is treated in Python as a list of words. An important property of lists is that we can “look up” a particular item by giving its index, e.g., text1[100]. Notice how we specify a number and get back a word. We can think of a list as a simple kind of table, as shown in Figure 5-2.

List lookup: We access the contents of a Python list with the help of an integer index.

Figure 5-2. List lookup: We access the contents of a Python list with the help of an integer index.

Contrast this situation with frequency distributions (Computing with Language: Simple Statistics), where we specify a word and get back a number, e.g., fdist['monstrous'], which tells us the number of times a given word has occurred in a text. Lookup using words is familiar to anyone ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required