Named Entity Recognition

At the start of this chapter, we briefly introduced named entities (NEs). Named entities are definite noun phrases that refer to specific types of individuals, such as organizations, persons, dates, and so on. Table 7-3 lists some of the more commonly used types of NEs. Most are self-explanatory; the two exceptions are FACILITY, which covers human-made artifacts in the domains of architecture and civil engineering, and GPE, which covers geo-political entities such as cities, states/provinces, and countries.

Table 7-3. Commonly used types of named entity

NE type         Examples
ORGANIZATION    Georgia-Pacific Corp., WHO
PERSON          Eddy Bonte, President Obama
LOCATION        Murray River, Mount Everest
DATE            June, 2008-06-29
TIME            two fifty a m, 1:30 p.m.
MONEY           175 million Canadian Dollars, GBP 10.40
PERCENT         twenty pct, 18.75 %
FACILITY        Washington Monument, Stonehenge
GPE             South East Asia, Midlothian

The goal of a named entity recognition (NER) system is to identify all textual mentions of the named entities. This can be broken down into two subtasks: identifying the boundaries of the NE, and identifying its type. While named entity recognition is frequently a prelude to identifying relations in Information Extraction, it can also contribute to other tasks. For example, in Question Answering (QA), we try to improve the precision of Information Retrieval by recovering not whole pages, but just those parts which contain an answer to the user’s question. Most QA systems take the documents returned by standard Information Retrieval, and ...
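The two subtasks named above, finding an entity's boundaries and assigning its type, can be illustrated with a small sketch. It assumes the tokens have already been labeled with IOB-style tags (B-TYPE opens an entity, I-TYPE continues it, O is outside any entity); the function name iob_to_spans, the example sentence, and its hand-written tags are hypothetical and are not the output of any real tagger.

```python
from typing import List, Tuple

def iob_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[int, int, str]]:
    """Turn per-token IOB tags into (start, end, type) spans.

    start is inclusive and end is exclusive, so tokens[start:end] is the
    entity text. The span answers the boundary subtask; the third element
    answers the type subtask.
    """
    spans = []
    start = None      # index where the currently open entity began
    ne_type = None    # type of the currently open entity
    for i, tag in enumerate(tags):
        # An I- tag continues the open entity only if the types match.
        continues = tag.startswith("I-") and ne_type == tag[2:]
        if start is not None and not continues:
            spans.append((start, i, ne_type))
            start, ne_type = None, None
        # B- always opens an entity; a stray I- is treated leniently as B-.
        if tag.startswith("B-") or (tag.startswith("I-") and start is None):
            start, ne_type = i, tag[2:]
    if start is not None:
        # Close an entity that runs to the end of the sentence.
        spans.append((start, len(tags), ne_type))
    return spans

# Hypothetical, hand-tagged example built from entries in Table 7-3.
tokens = ["President", "Obama", "visited", "Mount", "Everest", "in", "June"]
tags = ["B-PERSON", "I-PERSON", "O", "B-LOCATION", "I-LOCATION", "O", "B-DATE"]

spans = iob_to_spans(tokens, tags)
for start, end, ne_type in spans:
    print(ne_type, " ".join(tokens[start:end]))
```

Keeping the span and the type in one tuple makes it easy to evaluate the two subtasks separately: a system can get the boundaries of "Mount Everest" right while still mislabeling it as a GPE rather than a LOCATION.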
