Text Mining and Analysis

Book Description

Big data: It's unstructured, it's coming at you fast, and there's lots of it. In fact, the majority of big data is text-oriented, thanks to the proliferation of online sources such as blogs, emails, and social media. However, having big data means little if you can't leverage it with analytics. Now you can explore the large volumes of unstructured text data that your organization has collected with Text Mining and Analysis: Practical Methods, Examples, and Case Studies Using SAS.

This hands-on guide to text analytics using SAS provides detailed, step-by-step instructions and explanations on how to mine your text data for valuable insight. Through its comprehensive approach, you'll learn not just how to analyze your data, but how to collect, cleanse, organize, categorize, explore, and interpret it as well. Text Mining and Analysis also features an extensive set of case studies, so you can see examples of how the applications work with real-world data from a variety of industries.

Text analytics enables you to gain insights about your customers' behaviors and sentiments. Leverage your organization's text data, and use those insights for making better business decisions with Text Mining and Analysis.

Table of Contents

  1. About This Book
  2. About The Authors
  3. Acknowledgments
  4. Chapter 1 Introduction to Text Analytics
    1. Overview of Text Analytics
    2. Text Mining Using SAS Text Miner
    3. Information Retrieval
    4. Document Classification
    5. Ontology Management
    6. Information Extraction
    7. Clustering
    8. Trend Analysis
    9. Enhancing Predictive Models Using Exploratory Text Mining
    10. Sentiment Analysis
    11. Emerging Directions
      1. Handling Big (Text) Data
      2. Voice Mining
      3. Real-Time Text Analytics
    12. Summary
    13. References
  5. Chapter 2 Information Extraction Using SAS Crawler
    1. Introduction to Information Extraction and Organization
      1. SAS Crawler
      2. SAS Search and Indexing
      3. SAS Information Retrieval Studio Interface
    2. Web Crawler
      1. Breadth First
      2. Depth First
      3. Web Crawling: Real-World Applications and Examples
    3. Understanding Core Component Servers
      1. Proxy Server
      2. Pipeline Server
    4. Component Servers of SAS Search and Indexing
      1. Indexing Server
      2. Query Server
      3. Query Web Server
      4. Query Statistics Server
      5. SAS Markup Matcher Server
    5. Summary
    6. References
  6. Chapter 3 Importing Textual Data into SAS Text Miner
    1. An Introduction to SAS Enterprise Miner and SAS Text Miner
      1. Data Types, Roles, and Levels in SAS Text Miner
      2. Creating a Data Source in SAS Enterprise Miner
    2. Importing Textual Data into SAS
      1. Importing Data into SAS Text Miner Using the Text Import Node
      2. %TMFILTER Macro
      3. Importing XLS and XML Files into SAS Text Miner
      4. Managing Text Using SAS Character Functions
    3. Summary
    4. References
  7. Chapter 4 Parsing and Extracting Features
    1. Introduction
    2. Tokens and Words
      1. Lemmatization
      2. POS Tags
      3. Parsing Tree
    3. Text Parsing Node in SAS Text Miner
      1. Stemming and Synonyms
      2. Identifying Parts of Speech
      3. Using Start and Stop Lists
      4. Spell Checking
      5. Entities
    4. Building Custom Entities Using SAS Contextual Extraction Studio
    5. Summary
    6. References
  8. Chapter 5 Data Transformation
    1. Introduction
      1. Zipf’s Law
      2. Term-By-Document Matrix
      3. Text Filter Node
      4. Frequency Weightings
      5. Term Weightings
      6. Filtering Documents
      7. Concept Links
    2. Summary
    3. References
  9. Chapter 6 Clustering and Topic Extraction
    1. Introduction
      1. What Is Clustering?
      2. Singular Value Decomposition and Latent Semantic Indexing
      3. Topic Extraction
      4. Scoring
    2. Summary
    3. References
  10. Chapter 7 Content Management
    1. Introduction
      1. Content Categorization
      2. Types of Taxonomy
      3. Statistical Categorizer
      4. Rule-Based Categorizer
      5. Comparison of Statistical versus Rule-Based Categorizers
      6. Determining Category Membership
      7. Concept Extraction
      8. Contextual Extraction
      9. CLASSIFIER Definition
      10. SEQUENCE and PREDICATE_RULE Definitions
      11. Automatic Generation of Categorization Rules Using SAS Text Miner
      12. Differences between Text Clustering and Content Categorization
    2. Summary
    3. Appendix
    4. References
  11. Chapter 8 Sentiment Analysis
    1. Introduction
    2. Basics of Sentiment Analysis
      1. Challenges in Conducting Sentiment Analysis
      2. Unsupervised versus Supervised Sentiment Classification
      3. SAS Sentiment Analysis Studio Overview
    3. Statistical Models in SAS Sentiment Analysis Studio
    4. Rule-Based Models in SAS Sentiment Analysis Studio
      1. SAS Text Miner and SAS Sentiment Analysis Studio
    5. Summary
    6. References
  12. Case Studies
  13. Case Study 1 Text Mining SUGI/SAS Global Forum Paper Abstracts to Reveal Trends
    1. Introduction
      1. Data
      2. Results
      3. Trends
    2. Summary
    3. Instructions for Accessing the Case Study Project
  14. Case Study 2 Automatic Detection of Section Membership for SAS Conference Paper Abstract Submissions
    1. Introduction
      1. Objective
      2. Step-by-Step Instructions
    2. Summary
  15. Case Study 3 Features-based Sentiment Analysis of Customer Reviews
    1. Introduction
      1. Data
      2. Text Mining for Negative App Reviews
      3. Text Mining for Positive App Reviews
      4. NLP Based Sentiment Analysis
    2. Summary
  16. Case Study 4 Exploring Injury Data for Root Causal and Association Analysis
    1. Introduction
      1. Objective
      2. Data Description
    2. Step-by-Step Instructions
      1. Part 1: SAS Text Miner
      2. Part 2: SAS Enterprise Content Categorization
    3. Summary
  17. Case Study 5 Enhancing Predictive Models Using Textual Data
    1. Data Description
    2. Step-by-Step Instructions
    3. Summary
  18. Case Study 6 Opinion Mining of Professional Drivers’ Feedback
    1. Introduction
      1. Data
      2. Analysis Using SAS® Text Miner
      3. Analysis Using the Text Rule-builder Node
    2. Summary
  19. Case Study 7 Information Organization and Access of Enron Emails to Help Investigation
    1. Introduction
      1. Objective
      2. Step-by-Step Software Instruction with Settings/Properties
    2. Summary
  20. Case Study 8 Unleashing the Power of Unified Text Analytics to Categorize Call Center Data
    1. Introduction
      1. Data Description
      2. Examining Topics
      3. Merging or Splitting Topics
      4. Categorizing Content
      5. Concept Map Visualization
      6. Using PROC DS2 for Deployment
    2. Integrating with SAS® Visual Analytics
    3. Summary
  21. Case Study 9 Evaluating Health Provider Service Performance Using Textual Responses
    1. Introduction
    2. Summary
  22. Index