6

TOPIC EXTRACTION

Vocabulary analysis examines only the surface of a document: it measures which words are used without considering what those words mean. A far more powerful technique, known as semantic analysis, uses sophisticated language models that let the computer determine whether a word is a name or an action and make assumptions about its meaning. While not as accurate as trained human analysts, modern semantic analysis methods perform well across many types of text, opening the door to deeper analyses at an otherwise impossible scale.
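To make the limitation of surface-level counting concrete, here is a minimal sketch in Python. The function name `vocabulary_counts` and the sample sentence are illustrative assumptions, not taken from the text, and the tokenizer is deliberately naive. The sketch shows that a pure word count cannot tell the action "book" (to reserve) from the name "book" (a printed volume).

```python
import re
from collections import Counter


def vocabulary_counts(text):
    """Count lowercase word tokens, ignoring meaning and part of speech.

    Hypothetical helper for illustration only: a crude regex tokenizer
    stands in for whatever tokenization a real analysis would use.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


# Assumed sample text: "book" appears once as a verb and once as a noun.
sample = "Please book the flight. The book is on the table."
counts = vocabulary_counts(sample)

# Both senses of "book" collapse into a single tally.
print(counts["book"])  # 2
```

Separating the two uses would require the kind of disambiguation that semantic analysis provides, such as deciding whether each occurrence is a name or an action.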

How Machines Process Text

Unlike numeric computation, processing human-generated text presents a number of unique challenges to automation. Computers are ...
