At the start of this chapter, we briefly introduced named entities (NEs). Named entities are definite noun phrases that refer to specific types of individuals, such as organizations, persons, dates, and so on. Table 7-3 lists some of the more commonly used types of NEs. Most are self-explanatory; the exceptions are “FACILITY”, which covers human-made artifacts in the domains of architecture and civil engineering, and “GPE”, which covers geo-political entities such as cities, states or provinces, and countries.
Table 7-3. Commonly used types of named entity
NE Type         Examples
ORGANIZATION    Georgia-Pacific Corp., WHO
PERSON          Eddy Bonte, President Obama
LOCATION        Murray River, Mount Everest
TIME            two fifty a m, 1:30 p.m.
MONEY           175 million Canadian Dollars, GBP 10.40
PERCENT         twenty pct, 18.75 %
FACILITY        Washington Monument, Stonehenge
GPE             South East Asia, Midlothian
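To make these types concrete, a mention of a named entity can be represented as a span of tokens together with a type label. The following is a minimal sketch, not the method used by any particular NER system: it assumes a tiny hand-built lookup list (the hypothetical GAZETTEER below, seeded with a few examples from Table 7-3) and a hypothetical helper find_entities() that does greedy longest-match lookup over whitespace-separated tokens.

```python
# Hypothetical toy gazetteer: token sequences mapped to NE types.
# The entries are illustrative only, taken from the examples in Table 7-3.
GAZETTEER = {
    ("Mount", "Everest"): "LOCATION",
    ("President", "Obama"): "PERSON",
    ("WHO",): "ORGANIZATION",
    ("Stonehenge",): "FACILITY",
}


def find_entities(tokens, gazetteer=GAZETTEER):
    """Return (start, end, type) spans found by greedy longest-match lookup.

    start and end are token offsets, with end exclusive.
    """
    max_len = max(len(key) for key in gazetteer)
    spans = []
    i = 0
    while i < len(tokens):
        match = None
        # Try the longest candidate first so that multi-word names win.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            ne_type = gazetteer.get(tuple(tokens[i:i + n]))
            if ne_type is not None:
                match = (i, i + n, ne_type)
                break
        if match:
            spans.append(match)
            i = match[1]  # continue after the matched span
        else:
            i += 1
    return spans


tokens = "President Obama visited Stonehenge before flying to Mount Everest".split()
for start, end, ne_type in find_entities(tokens):
    print(ne_type, " ".join(tokens[start:end]))
```

Each result carries both the boundaries of the mention (the token offsets) and its type. A plain lookup like this only finds names it already lists and cannot resolve names that belong to more than one type, so it should be read as an illustration of the data representation rather than as a usable recognizer.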
The goal of a named entity recognition (NER) system is to identify all textual mentions of the named entities. This can be broken down into two subtasks: identifying the boundaries of the NE, and identifying its type. While named entity recognition is frequently a prelude to identifying relations in Information Extraction, it can also contribute to other tasks. For example, in Question Answering (QA), we try to improve the precision of Information Retrieval by recovering not whole pages, but just those parts which contain an answer to the user’s question. Most QA systems take the documents returned by standard Information Retrieval, and ...