Data Mining: Concepts and Techniques

Book Description

The increasing volume of data in modern business and science calls for more complex and sophisticated tools. Although advances in data mining technology have made extensive data collection much easier, the field is still evolving, and there is a constant need for new techniques and tools that can help us transform this data into useful information and knowledge.

Since the previous edition’s publication, great advances have been made in the field of data mining. The third edition of Data Mining: Concepts and Techniques not only continues the tradition of equipping you with an understanding and application of the theory and practice of discovering patterns hidden in large data sets, it also focuses on new, important topics in the field: data warehouses and data cube technology, mining streams, mining social networks, and mining spatial, multimedia, and other complex data. Each chapter is a stand-alone guide to a critical topic, presenting proven algorithms and sound implementations ready to be used directly or with strategic modification against live data. This is the resource you need if you want to apply today’s most powerful data mining techniques to meet real business challenges.



    * Presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects.
    * Addresses advanced topics such as mining object-relational databases, spatial databases, multimedia databases, time-series databases, text databases, the World Wide Web, and applications in several fields.
    * Provides a comprehensive, practical look at the concepts and techniques you need to get the most out of real business data.

Table of Contents

  1. Cover image
  2. Title page
  3. Table of Contents
  4. Front Matter
  5. Copyright
  6. Dedication
  7. Foreword
  8. Foreword to Second Edition
  9. Preface
    1. Organization of the Book
    2. To the Instructor
    3. To the Student
    4. To the Professional
    5. Book Web Sites with Resources
  10. Acknowledgments
    1. Third Edition of the Book
    2. Second Edition of the Book
    3. First Edition of the Book
  11. About the Authors
  12. 1. Introduction
    1. Publisher Summary
    2. 1.1 Why Data Mining?
    3. 1.2 What Is Data Mining?
    4. 1.3 What Kinds of Data Can Be Mined?
    5. 1.4 What Kinds of Patterns Can Be Mined?
    6. 1.5 Which Technologies Are Used?
    7. 1.6 Which Kinds of Applications Are Targeted?
    8. 1.7 Major Issues in Data Mining
    9. 1.8 Summary
    10. 1.9 Exercises
    11. 1.10 Bibliographic Notes
  13. 2. Getting to Know Your Data
    1. Publisher Summary
    2. 2.1 Data Objects and Attribute Types
    3. 2.2 Basic Statistical Descriptions of Data
    4. 2.3 Data Visualization
    5. 2.4 Measuring Data Similarity and Dissimilarity
    6. 2.5 Summary
    7. 2.6 Exercises
    8. 2.7 Bibliographic Notes
  14. 3. Data Preprocessing
    1. Publisher Summary
    2. 3.1 Data Preprocessing: An Overview
    3. 3.2 Data Cleaning
    4. 3.3 Data Integration
    5. 3.4 Data Reduction
    6. 3.5 Data Transformation and Data Discretization
    7. 3.6 Summary
    8. 3.7 Exercises
    9. 3.8 Bibliographic Notes
  15. 4. Data Warehousing and Online Analytical Processing
    1. Publisher Summary
    2. 4.1 Data Warehouse: Basic Concepts
    3. 4.2 Data Warehouse Modeling: Data Cube and OLAP
    4. 4.3 Data Warehouse Design and Usage
    5. 4.4 Data Warehouse Implementation
    6. 4.5 Data Generalization by Attribute-Oriented Induction
    7. 4.6 Summary
    8. 4.7 Exercises
    9. 4.8 Bibliographic Notes
  16. 5. Data Cube Technology
    1. Publisher Summary
    2. 5.1 Data Cube Computation: Preliminary Concepts
    3. 5.2 Data Cube Computation Methods
    4. 5.3 Processing Advanced Kinds of Queries by Exploring Cube Technology
    5. 5.4 Multidimensional Data Analysis in Cube Space
    6. 5.5 Summary
    7. 5.6 Exercises
    8. 5.7 Bibliographic Notes
  17. 6. Mining Frequent Patterns, Associations, and Correlations: Basic Concepts and Methods
    1. Publisher Summary
    2. 6.1 Basic Concepts
    3. 6.2 Frequent Itemset Mining Methods
    4. 6.3 Which Patterns Are Interesting?—Pattern Evaluation Methods
    5. 6.4 Summary
    6. 6.5 Exercises
    7. 6.6 Bibliographic Notes
  18. 7. Advanced Pattern Mining
    1. Publisher Summary
    2. 7.1 Pattern Mining: A Road Map
    3. 7.2 Pattern Mining in Multilevel, Multidimensional Space
    4. 7.3 Constraint-Based Frequent Pattern Mining
    5. 7.4 Mining High-Dimensional Data and Colossal Patterns
    6. 7.5 Mining Compressed or Approximate Patterns
    7. 7.6 Pattern Exploration and Application
    8. 7.7 Summary
    9. 7.8 Exercises
    10. 7.9 Bibliographic Notes
  19. 8. Classification: Basic Concepts
    1. Publisher Summary
    2. 8.1 Basic Concepts
    3. 8.2 Decision Tree Induction
    4. 8.3 Bayes Classification Methods
    5. 8.4 Rule-Based Classification
    6. 8.5 Model Evaluation and Selection
    7. 8.6 Techniques to Improve Classification Accuracy
    8. 8.7 Summary
    9. 8.8 Exercises
    10. 8.9 Bibliographic Notes
  20. 9. Classification: Advanced Methods
    1. Publisher Summary
    2. 9.1 Bayesian Belief Networks
    3. 9.2 Classification by Backpropagation
    4. 9.3 Support Vector Machines
    5. 9.4 Classification Using Frequent Patterns
    6. 9.5 Lazy Learners (or Learning from Your Neighbors)
    7. 9.6 Other Classification Methods
    8. 9.7 Additional Topics Regarding Classification
    9. 9.8 Summary
    10. 9.9 Exercises
    11. 9.10 Bibliographic Notes
  21. 10. Cluster Analysis: Basic Concepts and Methods
    1. Publisher Summary
    2. 10.1 Cluster Analysis
    3. 10.2 Partitioning Methods
    4. 10.3 Hierarchical Methods
    5. 10.4 Density-Based Methods
    6. 10.5 Grid-Based Methods
    7. 10.6 Evaluation of Clustering
    8. 10.7 Summary
    9. 10.8 Exercises
    10. 10.9 Bibliographic Notes
  22. 11. Advanced Cluster Analysis
    1. Publisher Summary
    2. 11.1 Probabilistic Model-Based Clustering
    3. 11.2 Clustering High-Dimensional Data
    4. 11.3 Clustering Graph and Network Data
    5. 11.4 Clustering with Constraints
    6. 11.5 Summary
    7. 11.6 Exercises
    8. 11.7 Bibliographic Notes
  23. 12. Outlier Detection
    1. Publisher Summary
    2. 12.1 Outliers and Outlier Analysis
    3. 12.2 Outlier Detection Methods
    4. 12.3 Statistical Approaches
    5. 12.4 Proximity-Based Approaches
    6. 12.5 Clustering-Based Approaches
    7. 12.6 Classification-Based Approaches
    8. 12.7 Mining Contextual and Collective Outliers
    9. 12.8 Outlier Detection in High-Dimensional Data
    10. 12.9 Summary
    11. 12.10 Exercises
    12. 12.11 Bibliographic Notes
  24. 13. Data Mining Trends and Research Frontiers
    1. Publisher Summary
    2. 13.1 Mining Complex Data Types
    3. 13.2 Other Methodologies of Data Mining
    4. 13.3 Data Mining Applications
    5. 13.4 Data Mining and Society
    6. 13.5 Data Mining Trends
    7. 13.6 Summary
    8. 13.7 Exercises
    9. 13.8 Bibliographic Notes
  25. Bibliography
  26. Index