Textual Information Access: Statistical Models by Francois Yvon, Eric Gaussier

Chapter 6

Conditional Random Fields for Information Extraction

6.1. Introduction

In Natural Language Processing, the ultimate goal of enabling computers to understand any text has gradually given way to more modest and pragmatic goals, which can be expressed as specific tasks. Information extraction is a typical example. It aims to identify, within a document, the factual pieces of information needed to fill the fields of a predefined form. In a sense, it aims to bridge the gap between the way humans apprehend information, in which natural language understanding plays a large part, and the way computers do, as typed data organized in structured files or databases. In a review article on the subject, McCallum describes this as a process of information distillation [MCC 05].

Several methods have been used to carry out this task. As is increasingly the case for most natural language engineering tasks, approaches based on statistical models are currently the most effective. This holds, however, only when the task is correctly reformulated as an annotation, or labeling, problem. The best statistical models currently able to learn how to annotate data are conditional random fields (CRFs).
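To make the reformulation concrete, here is a minimal sketch of how filling a form can be recast as labeling each token of a text. It assumes the widely used BIO encoding (B = begins a field, I = inside a field, O = outside any field), which may differ from the encoding adopted later in the chapter. The seminar announcement, the field names, and the functions `spans_to_bio` and `bio_to_form` are all hypothetical and serve only as illustration.

```python
from typing import Dict, List, Tuple


def spans_to_bio(n_tokens: int, spans: List[Tuple[int, int, str]]) -> List[str]:
    """Encode field spans (start, end exclusive, field name) as one BIO label per token."""
    labels = ["O"] * n_tokens
    for start, end, field in spans:
        labels[start] = f"B-{field}"
        for i in range(start + 1, end):
            labels[i] = f"I-{field}"
    return labels


def bio_to_form(tokens: List[str], labels: List[str]) -> Dict[str, List[str]]:
    """Decode a labeled token sequence back into the fields of a form."""
    form: Dict[str, List[str]] = {}
    field, words = None, []
    # A final "O" sentinel makes sure the last open span is stored.
    for token, label in list(zip(tokens, labels)) + [("", "O")]:
        if label.startswith("I-") and label[2:] == field:
            words.append(token)  # the current field continues
            continue
        if field is not None:
            form.setdefault(field, []).append(" ".join(words))
        if label == "O":
            field, words = None, []
        else:
            # "B-x" opens a field; a stray "I-x" is treated as an opening too.
            field, words = label[2:], [token]
    return form


# Hypothetical seminar announcement and its annotated fields.
tokens = ["Seminar", "by", "Ada", "Lovelace", "at", "3", "pm", "in", "Room", "12"]
spans = [(2, 4, "speaker"), (5, 7, "time"), (8, 10, "location")]

labels = spans_to_bio(len(tokens), spans)
form = bio_to_form(tokens, labels)
print(list(zip(tokens, labels)))
print(form)
```

Under this view, a statistical labeling model only has to predict the `labels` sequence from the `tokens` sequence; the form is then recovered deterministically by the decoding step.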

This chapter thus presents both the task of information extraction and the statistical labeling models able to handle it. The first two sections concentrate on the task, discussing the issues it raises and the specific problems it poses. The following four sections focus on statistical ...
