3

VOCABULARY ANALYSIS

Vocabulary analysis examines a document's wording. It is a surface analysis of word choice rather than an exploration of the concepts those words represent. Language use can be a powerful descriptive tool for characterizing texts: different topics, authors, and time periods typically show marked stratification in their vocabularies. Frequency histograms can assist in authorship attribution, which judges the likelihood that a document was written by a particular individual based on similarities in the use of certain words or word classes. Author gender can be explored through patterns in word class use, while readability indexes can estimate the ease with which a passage will be understood. Term ...
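To make two of these ideas concrete, the following is a minimal Python sketch, not a method taken from this chapter. It builds a relative-frequency profile over a small word list, compares two profiles with cosine similarity as a rough authorship signal, and computes the Automated Readability Index (ARI) as one example of a readability index. The word list `FUNCTION_WORDS`, the function names, and the choice of cosine similarity and ARI are all assumptions made for illustration; a real study would select its own marker words, similarity measure, and index.

```python
import math
import re
from collections import Counter

# Hypothetical marker-word list chosen only for illustration.
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that", "it", "but", "with"]


def tokenize(text):
    """Lowercase the text and return its word tokens (letters, optional apostrophe)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def relative_frequencies(text, vocabulary=FUNCTION_WORDS):
    """Return each vocabulary word's share of all tokens in the text."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return [counts[word] / total for word in vocabulary]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length frequency vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def automated_readability_index(text):
    """ARI = 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43."""
    words = text.split()
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return 0.0
    characters = sum(ch.isalnum() for ch in text)  # letters and digits only
    return 4.71 * (characters / len(words)) + 0.5 * (len(words) / len(sentences)) - 21.43


if __name__ == "__main__":
    # Invented sample sentences, not drawn from any real corpus.
    known = "It was the best of times and it was the worst of times."
    disputed = "The spirit of the age was in the air, and it was felt by all."
    profile_a = relative_frequencies(known)
    profile_b = relative_frequencies(disputed)
    print("similarity:", cosine_similarity(profile_a, profile_b))
    print("ARI:", automated_readability_index(disputed))
```

A higher cosine value indicates more similar usage of the chosen words, but on texts this short the number means very little; frequency-based attribution needs samples long enough for the word proportions to stabilize.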

