## Table of Contents

2. Foreword
3. Preface to the Second Edition
4. Preface to the First Edition
5. Acknowledgments
6. I. Preliminaries
    1. 1. Introduction
    2. 2. Overview of the Data Mining Process
        1. 2.1. Introduction
        2. 2.2. Core Ideas in Data Mining
        3. 2.3. Supervised and Unsupervised Learning
        4. 2.4. Steps in Data Mining
        5. 2.5. Preliminary Steps
        6. 2.6. Building a Model: Example with Linear Regression
        7. 2.7. Using Excel for Data Mining
        8. 2.8. PROBLEMS
7. II. Data Exploration and Dimension Reduction
    1. 3. Data Visualization
        1. 3.1. Uses of Data Visualization
        2. 3.2. Data Examples
        3. 3.3. Basic Charts: bar charts, line graphs, and scatterplots
        4. 3.4. Multidimensional Visualization
        5. 3.5. Specialized Visualizations
        6. 3.6. Summary of major visualizations and operations, according to data mining goal
        7. 3.7. PROBLEMS
    2. 4. Dimension Reduction
        1. 4.1. Introduction
        2. 4.2. Practical Considerations
        3. 4.3. Data Summaries
        4. 4.4. Correlation Analysis
        5. 4.5. Reducing the Number of Categories in Categorical Variables
        6. 4.6. Converting A Categorical Variable to A Numerical Variable
        7. 4.7. Principal Components Analysis
        8. 4.8. Dimension Reduction Using Regression Models
        9. 4.9. Dimension Reduction Using Classification and Regression Trees
        10. 4.10. PROBLEMS
8. III. Performance Evaluation
    1. 5. Evaluating Classification and Predictive Performance
        1. 5.1. Introduction
        2. 5.2. Judging Classification Performance
        3. 5.3. Evaluating Predictive Performance
        4. 5.4. PROBLEMS
9. IV. Prediction and Classification Methods
    1. 6. Multiple Linear Regression
        1. 6.1. Introduction
        2. 6.2. Explanatory versus Predictive Modeling
        3. 6.3. Estimating the Regression Equation and Prediction
        4. 6.4. Variable Selection in Linear Regression
        5. 6.5. PROBLEMS
    2. 7. k-Nearest Neighbors (k-NN)
        1. 7.1. k-NN Classifier (categorical outcome)
        2. 7.2. k-NN for a Numerical Response
        3. 7.3. Advantages and Shortcomings of k-NN Algorithms
        4. 7.4. PROBLEMS
    3. 8. Naive Bayes
        1. 8.1. Introduction
        2. 8.2. Applying the Full (Exact) Bayesian Classifier
        3. 8.3. Advantages and Shortcomings of the Naive Bayes Classifier
        4. 8.4. PROBLEMS
    4. 9. Classification and Regression Trees
        1. 9.1. Introduction
        2. 9.2. Classification Trees
        3. 9.3. Measures of Impurity
        4. 9.4. Evaluating the Performance of a Classification Tree
        5. 9.5. Avoiding Overfitting
        6. 9.6. Classification Rules from Trees
        7. 9.7. Classification Trees for More Than two Classes
        8. 9.8. Regression Trees
        9. 9.9. Advantages, Weaknesses, and Extensions
        10. 9.10. PROBLEMS
    5. 10. Logistic Regression
        1. 10.1. Introduction
        2. 10.2. Logistic Regression Model
        3. 10.3. Evaluating Classification Performance
        4. 10.4. Example of Complete Analysis: Predicting Delayed Flights
        5. 10.5. Appendix: Logistic Regression for Profiling
        6. 10.6. PROBLEMS
    6. 11. Neural Nets
        1. 11.1. Introduction
        2. 11.2. Concept and Structure of a Neural Network
        3. 11.3. Fitting a Network to Data
        4. 11.4. Required User Input
        5. 11.5. Exploring the Relationship Between Predictors and Response
        6. 11.6. Advantages and Weaknesses of Neural Networks
        7. 11.7. PROBLEMS
    7. 12. Discriminant Analysis
        1. 12.1. Introduction
        2. 12.2. Distance of an Observation from a Class
        3. 12.3. Fisher's Linear Classification Functions
        4. 12.4. Classification Performance of Discriminant Analysis
        5. 12.5. Prior Probabilities
        6. 12.6. Unequal Misclassification Costs
        7. 12.7. Classifying More Than Two Classes
        9. 12.9. PROBLEMS
10. V. Mining Relationships Among Records
    1. 13. Association Rules
        1. 13.1. Introduction
        2. 13.2. Discovering Association Rules in Transaction Databases
        3. 13.3. Generating Candidate Rules
        4. 13.4. Selecting Strong Rules
        5. 13.5. Summary
        6. 13.6. PROBLEMS
    2. 14. Cluster Analysis
        1. 14.1. Introduction
        2. 14.2. Measuring Distance Between Two Records
        3. 14.3. Measuring Distance Between Two Clusters
        4. 14.4. Hierarchical (Agglomerative) Clustering
        5. 14.5. Nonhierarchical Clustering: The k-Means Algorithm
        6. 14.6. PROBLEMS
11. VI. Forecasting Time Series
    1. 15. Handling Time Series
        1. 15.1. Introduction
        2. 15.2. Explanatory versus Predictive Modeling
        3. 15.3. Popular Forecasting Methods in Business
        4. 15.4. Time Series Components
        5. 15.5. Data Partitioning
        6. 15.6. PROBLEMS
    2. 16. Regression-Based Forecasting
        1. 16.1. Model with Trend
        2. 16.2. Model with Seasonality
        3. 16.3. Model with trend and seasonality
        4. 16.4. Autocorrelation and ARIMA Models
        5. 16.5. PROBLEMS
    3. 17. Smoothing Methods
        1. 17.1. Introduction
        2. 17.2. Moving Average
        3. 17.3. Simple Exponential Smoothing