Introduction

The information society in which we live produces a constantly changing flow of diverse types of data which needs to be processed quickly and efficiently, whether for professional or leisure purposes. Our capacity to evolve in this society depends more and more on our ability to find the information best suited to our needs, to filter this information so as to extract its main topics, snippets, tendencies and opinions, but also to visualize, summarize and translate it. These various processes raise two important issues: on the one hand, the development of complex mathematical models that fully take into account the data to be processed; on the other hand, the development of efficient algorithms associated with these models, capable of handling large quantities of data and of providing practical solutions to the above problems.

In order to meet these requirements, several scientific communities have turned to probabilistic models and statistical methods, which offer both richness in modeling and robustness in processing large quantities of data. Such models and methods can, furthermore, adapt to the evolution of data sources. The scientific communities involved in text processing have not stood apart from this movement, and the methods and tools for natural language processing and information retrieval are today largely based on complex statistical models developed over several years.

Students, engineers ...
