Chapter 24. Extracting data with regular expressions

Matija Lah

In most information management domains, such as resource management, sales, production, and accounting, natural language texts are rarely used as a source of information. In other domains, such as content and document management, natural language texts are the principal, or even the sole, source of information.

Before this information can be used, it must be extracted from the source texts. This can be done manually: a person reads the source, identifies individual pieces of data, and then copies and pastes them into a data entry application. Fortunately, the operation is highly deterministic, which makes it possible to automate. ...
