Data Architecture: A Primer for the Data Scientist by Dan Linstedt, W.H. Inmon

2.6 Textual Disambiguation

Abstract

Unstructured nonrepetitive data is contextualized through a process called textual disambiguation, sometimes called textual ETL. Only after textual disambiguation can unstructured nonrepetitive data be analyzed. Textual disambiguation consists of many different algorithms; the two most prominent are document fracturing and the named value process (sometimes called inline contextualization). Identifying the documents that need to be processed through textual disambiguation is preceded by the mapping process. Documents are normally processed iteratively. Another form of disambiguation is report decompilation. ...