Data Mining For Business Intelligence: Concepts, Techniques, and Applications in Microsoft Office Excel® with XLMiner®, Second Edition

Book Description

Data Mining for Business Intelligence, Second Edition uses real data and actual cases to illustrate the applicability of data mining (DM) intelligence in the development of successful business models. Featuring complimentary access to XLMiner®, the Microsoft Office Excel® add-in, this book allows readers to follow along and implement algorithms at their own speed, with a minimal learning curve. In addition, students and practitioners of DM techniques are presented with hands-on, business-oriented applications. An abundance of exercises and examples, doubled in number in the second edition, is provided to motivate learning and understanding. This book helps readers understand the beneficial relationship that can be established between DM and smart business practices, and is an excellent learning tool for creating valuable strategies and making wiser business decisions. New topics include detailed coverage of visualization (enhanced by Spotfire subroutines) and time series forecasting, among other subjects.

Table of Contents

  1. Copyright
  2. Foreword
  3. Preface to the Second Edition
  4. Preface to the First Edition
  5. Acknowledgments
  6. I. Preliminaries
    1. 1. Introduction
      1. 1.1. What Is Data Mining?
      2. 1.2. Where Is Data Mining Used?
      3. 1.3. Origins of Data Mining
      4. 1.4. Rapid Growth of Data Mining
      5. 1.5. Why Are There So Many Different Methods?
      6. 1.6. Terminology and Notation
      7. 1.7. Road Maps to This Book
        1. 1.7.1. Order of Topics
    2. 2. Overview of the Data Mining Process
      1. 2.1. Introduction
      2. 2.2. Core Ideas in Data Mining
        1. 2.2.1. Classification
        2. 2.2.2. Prediction
        3. 2.2.3. Association Rules
        4. 2.2.4. Predictive Analytics
        5. 2.2.5. Data Reduction
        6. 2.2.6. Data Exploration
        7. 2.2.7. Data Visualization
      3. 2.3. Supervised and Unsupervised Learning
      4. 2.4. Steps in Data Mining
      5. 2.5. Preliminary Steps
        1. 2.5.1. Organization of Datasets
        2. 2.5.2. Sampling from a Database
        3. 2.5.3. Oversampling Rare Events
        4. 2.5.4. Preprocessing and Cleaning the Data
        5. 2.5.5. Use and Creation of Partitions
      6. 2.6. Building a Model: Example with Linear Regression
        1. 2.6.1. Boston Housing Data
        2. 2.6.2. Modeling Process
      7. 2.7. Using Excel for Data Mining
      8. 2.8. PROBLEMS
  7. II. Data Exploration and Dimension Reduction
    1. 3. Data Visualization
      1. 3.1. Uses of Data Visualization
      2. 3.2. Data Examples
        1. 3.2.1. Example 1: Boston Housing Data
        2. 3.2.2. Example 2: Ridership on Amtrak Trains
      3. 3.3. Basic Charts: Bar Charts, Line Graphs, and Scatterplots
        1. 3.3.1. Distribution Plots: Boxplots and Histograms
        2. 3.3.2. Heatmaps: Visualizing Correlations and Missing Values
      4. 3.4. Multidimensional Visualization
        1. 3.4.1. Adding Variables: Color, Size, Shape, Multiple Panels, and Animation
        2. 3.4.2. Manipulations: Rescaling, Aggregation and Hierarchies, Zooming and Panning, and Filtering
        3. 3.4.3. Reference: Trend Lines and Labels
        4. 3.4.4. Scaling Up: Large Datasets
        5. 3.4.5. Multivariate Plot: Parallel Coordinates Plot
        6. 3.4.6. Interactive Visualization
      5. 3.5. Specialized Visualizations
        1. 3.5.1. Visualizing Networked Data
        2. 3.5.2. Visualizing Hierarchical Data: Treemaps
        3. 3.5.3. Visualizing Geographical Data: Map Charts
      6. 3.6. Summary of Major Visualizations and Operations, According to Data Mining Goal
        1. 3.6.1. Prediction
        2. 3.6.2. Classification
        3. 3.6.3. Time Series Forecasting
        4. 3.6.4. Unsupervised Learning
      7. 3.7. PROBLEMS
    2. 4. Dimension Reduction
      1. 4.1. Introduction
      2. 4.2. Practical Considerations
        1. 4.2.1. Example 1: House Prices in Boston
      3. 4.3. Data Summaries
        1. 4.3.1. Summary Statistics
        2. 4.3.2. Pivot Tables
      4. 4.4. Correlation Analysis
      5. 4.5. Reducing the Number of Categories in Categorical Variables
      6. 4.6. Converting a Categorical Variable to a Numerical Variable
      7. 4.7. Principal Components Analysis
        1. 4.7.1. Example 2: Breakfast Cereals
        2. 4.7.2. Principal Components
        3. 4.7.3. Normalizing the Data
        4. 4.7.4. Using Principal Components for Classification and Prediction
      8. 4.8. Dimension Reduction Using Regression Models
      9. 4.9. Dimension Reduction Using Classification and Regression Trees
      10. 4.10. PROBLEMS
  8. III. Performance Evaluation
    1. 5. Evaluating Classification and Predictive Performance
      1. 5.1. Introduction
      2. 5.2. Judging Classification Performance
        1. 5.2.1. Benchmark: The Naive Rule
        2. 5.2.2. Class Separation
        3. 5.2.3. Classification Matrix
        4. 5.2.4. Using the Validation Data
        5. 5.2.5. Accuracy Measures
        6. 5.2.6. Cutoff for Classification
        7. 5.2.7. Performance in Unequal Importance of Classes
        8. 5.2.8. Asymmetric Misclassification Costs
        9. 5.2.9. Oversampling and Asymmetric Costs
        10. 5.2.10. Classification Using a Triage Strategy
      3. 5.3. Evaluating Predictive Performance
        1. 5.3.1. Benchmark: The Average
        2. 5.3.2. Prediction Accuracy Measures
      4. 5.4. PROBLEMS
  9. IV. Prediction and Classification Methods
    1. 6. Multiple Linear Regression
      1. 6.1. Introduction
      2. 6.2. Explanatory versus Predictive Modeling
      3. 6.3. Estimating the Regression Equation and Prediction
        1. 6.3.1. Example: Predicting the Price of Used Toyota Corolla Automobiles
      4. 6.4. Variable Selection in Linear Regression
        1. 6.4.1. Reducing the Number of Predictors
        2. 6.4.2. How to Reduce the Number of Predictors
      5. 6.5. PROBLEMS
    2. 7. k-Nearest Neighbors (k-NN)
      1. 7.1. k-NN Classifier (categorical outcome)
        1. 7.1.1. Determining Neighbors
        2. 7.1.2. Classification Rule
        3. 7.1.3. Example: Riding Mowers
        4. 7.1.4. Choosing k
        5. 7.1.5. Setting the Cutoff Value
        6. 7.1.6. k-NN with More Than Two Classes
      2. 7.2. k-NN for a Numerical Response
      3. 7.3. Advantages and Shortcomings of k-NN Algorithms
      4. 7.4. PROBLEMS
    3. 8. Naive Bayes
      1. 8.1. Introduction
        1. 8.1.1. Example 1: Predicting Fraudulent Financial Reporting
      2. 8.2. Applying the Full (Exact) Bayesian Classifier
        1. 8.2.1. Practical Difficulty with the Complete (Exact) Bayes Procedure
        2. 8.2.2. Solution: Naive Bayes
        3. 8.2.3. Example 2: Predicting Fraudulent Financial Reports, Two Predictors
        4. 8.2.4. Example 3: Predicting Delayed Flights
      3. 8.3. Advantages and Shortcomings of the Naive Bayes Classifier
      4. 8.4. PROBLEMS
    4. 9. Classification and Regression Trees
      1. 9.1. Introduction
      2. 9.2. Classification Trees
        1. 9.2.1. Recursive Partitioning
        2. 9.2.2. Example 1: Riding Mowers
      3. 9.3. Measures of Impurity
        1. 9.3.1. Tree Structure
        2. 9.3.2. Classifying a New Observation
      4. 9.4. Evaluating the Performance of a Classification Tree
        1. 9.4.1. Example 2: Acceptance of Personal Loan
      5. 9.5. Avoiding Overfitting
        1. 9.5.1. Stopping Tree Growth: CHAID
        2. 9.5.2. Pruning the Tree
      6. 9.6. Classification Rules from Trees
      7. 9.7. Classification Trees for More Than Two Classes
      8. 9.8. Regression Trees
        1. 9.8.1. Prediction
        2. 9.8.2. Measuring Impurity
        3. 9.8.3. Evaluating Performance
      9. 9.9. Advantages, Weaknesses, and Extensions
      10. 9.10. PROBLEMS
    5. 10. Logistic Regression
      1. 10.1. Introduction
      2. 10.2. Logistic Regression Model
        1. 10.2.1. Example: Acceptance of Personal Loan
        2. 10.2.2. Model with a Single Predictor
        3. 10.2.3. Estimating the Logistic Model from Data: Computing Parameter Estimates
        4. 10.2.4. Interpreting Results in Terms of Odds
      3. 10.3. Evaluating Classification Performance
        1. 10.3.1. Variable Selection
        2. 10.3.2. Impact of Single Predictors
      4. 10.4. Example of Complete Analysis: Predicting Delayed Flights
        1. 10.4.1. Data Preprocessing
        2. 10.4.2. Model Fitting and Estimation
        3. 10.4.3. Model Interpretation
        4. 10.4.4. Model Performance
        5. 10.4.5. Variable Selection
      5. 10.5. Appendix: Logistic Regression for Profiling
        1. 10.5.1. Appendix A: Why Linear Regression Is Inappropriate for a Categorical Response
        2. 10.5.2. Appendix B: Evaluating Goodness of Fit
        3. 10.5.3. Appendix C: Logistic Regression for More Than Two Classes
      6. 10.6. PROBLEMS
    6. 11. Neural Nets
      1. 11.1. Introduction
      2. 11.2. Concept and Structure of a Neural Network
      3. 11.3. Fitting a Network to Data
        1. 11.3.1. Example 1: Tiny Dataset
        2. 11.3.2. Computing Output of Nodes
        3. 11.3.3. Preprocessing the Data
        4. 11.3.4. Training the Model
        5. 11.3.5. Example 2: Classifying Accident Severity
        6. 11.3.6. Avoiding Overfitting
        7. 11.3.7. Using the Output for Prediction and Classification
      4. 11.4. Required User Input
      5. 11.5. Exploring the Relationship Between Predictors and Response
      6. 11.6. Advantages and Weaknesses of Neural Networks
      7. 11.7. PROBLEMS
    7. 12. Discriminant Analysis
      1. 12.1. Introduction
        1. 12.1.1. Example 1: Riding Mowers
        2. 12.1.2. Example 2: Personal Loan Acceptance
      2. 12.2. Distance of an Observation from a Class
      3. 12.3. Fisher's Linear Classification Functions
      4. 12.4. Classification Performance of Discriminant Analysis
      5. 12.5. Prior Probabilities
      6. 12.6. Unequal Misclassification Costs
      7. 12.7. Classifying More Than Two Classes
        1. 12.7.1. Example 3: Medical Dispatch to Accident Scenes
      8. 12.8. Advantages and Weaknesses
      9. 12.9. PROBLEMS
  10. V. Mining Relationships Among Records
    1. 13. Association Rules
      1. 13.1. Introduction
      2. 13.2. Discovering Association Rules in Transaction Databases
        1. 13.2.1. Example 1: Synthetic Data on Purchases of Phone Faceplates
      3. 13.3. Generating Candidate Rules
        1. 13.3.1. The Apriori Algorithm
      4. 13.4. Selecting Strong Rules
        1. 13.4.1. Support and Confidence
        2. 13.4.2. Lift Ratio
        3. 13.4.3. Data Format
        4. 13.4.4. Process of Rule Selection
        5. 13.4.5. Interpreting the Results
        6. 13.4.6. Statistical Significance of Rules
        7. 13.4.7. Example 2: Rules for Similar Book Purchases
      5. 13.5. Summary
      6. 13.6. PROBLEMS
    2. 14. Cluster Analysis
      1. 14.1. Introduction
        1. 14.1.1. Example: Public Utilities
      2. 14.2. Measuring Distance Between Two Records
        1. 14.2.1. Euclidean Distance
        2. 14.2.2. Normalizing Numerical Measurements
        3. 14.2.3. Other Distance Measures for Numerical Data
        4. 14.2.4. Distance Measures for Categorical Data
        5. 14.2.5. Distance Measures for Mixed Data
      3. 14.3. Measuring Distance Between Two Clusters
      4. 14.4. Hierarchical (Agglomerative) Clustering
        1. 14.4.1. Minimum Distance (Single Linkage)
        2. 14.4.2. Maximum Distance (Complete Linkage)
        3. 14.4.3. Average Distance (Average Linkage)
        4. 14.4.4. Centroid Distance (Average Group Linkage)
        5. 14.4.5. Ward's Method
        6. 14.4.6. Dendrograms: Displaying Clustering Process and Results
        7. 14.4.7. Validating Clusters
        8. 14.4.8. Limitations of Hierarchical Clustering
      5. 14.5. Nonhierarchical Clustering: The k-Means Algorithm
        1. 14.5.1. Initial Partition into k Clusters
      6. 14.6. PROBLEMS
  11. VI. Forecasting Time Series
    1. 15. Handling Time Series
      1. 15.1. Introduction
      2. 15.2. Explanatory versus Predictive Modeling
      3. 15.3. Popular Forecasting Methods in Business
        1. 15.3.1. Combining Methods
      4. 15.4. Time Series Components
        1. 15.4.1. Example: Ridership on Amtrak Trains
      5. 15.5. Data Partitioning
      6. 15.6. PROBLEMS
    2. 16. Regression-Based Forecasting
      1. 16.1. Model with Trend
        1. 16.1.1. Linear Trend
        2. 16.1.2. Exponential Trend
        3. 16.1.3. Polynomial Trend
      2. 16.2. Model with Seasonality
      3. 16.3. Model with Trend and Seasonality
      4. 16.4. Autocorrelation and ARIMA Models
        1. 16.4.1. Computing Autocorrelation
        2. 16.4.2. Improving Forecasts by Integrating Autocorrelation Information
        3. 16.4.3. Evaluating Predictability
      5. 16.5. PROBLEMS
    3. 17. Smoothing Methods
      1. 17.1. Introduction
      2. 17.2. Moving Average
        1. 17.2.1. Centered Moving Average for Visualization
        2. 17.2.2. Trailing Moving Average for Forecasting
        3. 17.2.3. Choosing Window Width (w)
      3. 17.3. Simple Exponential Smoothing
        1. 17.3.1. Choosing Smoothing Parameter α
        2. 17.3.2. Relation between Moving Average and Simple Exponential Smoothing
      4. 17.4. Advanced Exponential Smoothing
        1. 17.4.1. Series with a Trend
        2. 17.4.2. Series with a Trend and Seasonality
        3. 17.4.3. Series with Seasonality (No Trend)
      5. 17.5. PROBLEMS
  12. VII. Cases
    1. 18. Cases
      1. 18.1. Charles Book Club
        1. 18.1.1. The Book Industry
        2. 18.1.2. Database Marketing at Charles
        3. 18.1.3. Data Mining Techniques
        4. 18.1.4. Assignment
      2. 18.2. German Credit
        1. 18.2.1. Assignment
      3. 18.3. Tayko Software Cataloger
        1. 18.3.1. Background
        2. 18.3.2. The Mailing Experiment
        3. 18.3.3. Data
        4. 18.3.4. Assignment
      4. 18.4. Segmenting Consumers of Bath Soap
        1. 18.4.1. Business Situation
        2. 18.4.2. Key Problems
        3. 18.4.3. Data
        4. 18.4.4. Measuring Brand Loyalty
        5. 18.4.5. Assignment
        6. 18.4.6. Appendix
      5. 18.5. Direct-Mail Fundraising
        1. 18.5.1. Background
        2. 18.5.2. Data
        3. 18.5.3. Assignment
      6. 18.6. Catalog Cross Selling
        1. 18.6.1. Background
        2. 18.6.2. Assignment
      7. 18.7. Predicting Bankruptcy
        1. 18.7.1. Predicting Corporate Bankruptcy
        2. 18.7.2. Assignment
      8. 18.8. Time Series Case: Forecasting Public Transportation Demand
        1. 18.8.1. Background
        2. 18.8.2. Problem Description
        3. 18.8.3. Available Data
        4. 18.8.4. Assignment Goal
        5. 18.8.5. Assignment
        6. 18.8.6. Tips and Suggested Steps
  13. References