Chapter 10. Case Study – Text Analytics

So far, we have focused on deriving insights and building models on top of data that has a well-defined, fixed structure. Data sources such as delimited files and database tables have a fixed format and are called structured sources of data. Structured data is the mainstay of analytics, and most of the use cases we discussed rely on it. Data sources such as social media posts, support case comments, e-mails, and articles are called unstructured data, and they can contain business insights about customers and products that are not readily available in structured data. For example, structured information such as product usage tables can tell us that a particular customer is not using ...

