Visual Data Mining: The VisMiner Approach, 2nd Edition

Book Description

A visual approach to data mining.

Data mining has been defined as the search for useful and previously unknown patterns in large datasets, yet when faced with the task of mining a large dataset, it is not always obvious where to start and how to proceed.

This book introduces a visual methodology for data mining and demonstrates its application through a sequence of exercises using VisMiner. Developed by the author, VisMiner is a powerful visual data mining tool that lets readers see the data they are working on and visually evaluate the models created from that data.

Key features:

  • Presents visual support for all phases of data mining including dataset preparation.

  • Provides a comprehensive set of non-trivial datasets and problems with accompanying software.

  • Features 3-D visualizations of multi-dimensional datasets.

  • Gives support for spatial data analysis with GIS-like features.

  • Describes data mining algorithms with guidance on when and how to use them.

  • Accompanied by VisMiner, a visual software tool for data mining, developed specifically to bridge the gap between theory and practice.

Visual Data Mining: The VisMiner Approach is designed as a hands-on workbook that introduces these methodologies to students in data mining, advanced statistics, and business intelligence courses. It provides a set of tutorials, exercises, and case studies that support students in learning data mining processes.

In praise of the VisMiner approach:

"What we discovered among students was that the visualization concepts and tools brought the analysis alive in a way that was broadly understood and could be used to make sound decisions with greater certainty about the outcomes"—Dr. James V. Hansen, J. Owen Cherrington Professor, Marriott School, Brigham Young University, USA

"Students learn best when they are able to visualize relationships between data and results during the data mining process. VisMiner is easy to learn and yet offers great visualization capabilities throughout the data mining process. My students liked it very much and so did I." —Dr. Douglas Dean, Assoc. Professor of Information Systems, Marriott School, Brigham Young University, USA

Table of Contents

  1. Cover
  2. Title Page
  3. Copyright
  4. Preface
  5. Acknowledgments
  6. Chapter 1: Introduction
    1. Data Mining Objectives
    2. Introduction to VisMiner
    3. The Data Mining Process
    4. Summary
  7. Chapter 2: Initial Data Exploration and Dataset Preparation Using VisMiner
    1. The Rationale for Visualizations
    2. Tutorial – Using VisMiner
    3. Summary
  8. Chapter 3: Advanced Topics in Initial Exploration and Dataset Preparation Using VisMiner
    1. Missing Values
    2. Summary
  9. Chapter 4: Prediction Algorithms for Data Mining
    1. Decision Trees
    2. Artificial Neural Networks
    3. Support Vector Machines
    4. Summary
  10. Chapter 5: Classification Models in VisMiner
    1. Dataset Preparation
    2. Tutorial – Building and Evaluating Classification Models
    3. Model Evaluation
    4. Prediction Likelihoods
    5. Classification Model Performance
    6. Interpreting the ROC Curve
    7. Classification Ensembles
    8. Model Application
    9. Summary
  11. Chapter 6: Regression Analysis
    1. The Regression Model
    2. Correlation and Causation
    3. Algorithms for Regression Analysis
    4. Assessing Regression Model Performance
    5. Model Validity
    6. Looking Beyond R²
    7. Polynomial Regression
    8. Artificial Neural Networks for Regression Analysis
    9. Dataset Preparation
    10. Tutorial
    11. A Regression Model for Home Appraisal
    12. Modeling with the Right Set of Observations
    13. ANN Modeling
    14. The Advantage of ANN Regression
    15. Top-Down Attribute Selection
    16. Issues in Model Interpretation
    17. Model Validation
    18. Model Application
    19. Summary
  12. Chapter 7: Cluster Analysis
    1. Introduction
    2. Algorithms for Cluster Analysis
    3. Issues with K-Means Clustering Process
    4. Hierarchical Clustering
    5. Measures of Cluster and Clustering Quality
    6. Silhouette Coefficient
    7. Correlation Coefficient
    8. Self-Organizing Maps (SOM)
    9. Self-Organizing Maps in VisMiner
    10. Choosing the Grid Dimensions
    11. Advantages of a 3-D Grid
    12. Extracting Subsets from a Clustering
    13. Summary
  13. Appendix A: VisMiner Reference by Task
    1. Dataset Preparation
    2. Data Exploration
    3. Model Building – Algorithm Application
    4. Model Evaluation
  14. Appendix B: VisMiner Task/Tool Matrix
  15. Appendix C: IP Address Look-up
    1. IP Address for VisSlave When Using One Computer
    2. IP Address for VisSlave When Using Multiple Computers
  16. Index