Combining Pattern Classifiers: Methods and Algorithms, 2nd Edition

Book Description

A unified, coherent treatment of current classifier ensemble methods, from fundamentals of pattern recognition to ensemble feature selection, now in its second edition

The art and science of combining pattern classifiers has flourished into a prolific discipline since the first edition of Combining Pattern Classifiers was published in 2004. Dr. Kuncheva has plucked from the rich landscape of recent classifier ensemble literature the topics, methods, and algorithms that will guide the reader toward a deeper understanding of the fundamentals, design, and applications of classifier ensemble methods.

Thoroughly updated, with MATLAB code and practice data sets throughout, Combining Pattern Classifiers includes:

  • Coverage of Bayes decision theory and experimental comparison of classifiers

  • Essential ensemble methods such as Bagging, Random Forest, AdaBoost, Random Subspace, Rotation Forest, Random Oracle, and Error Correcting Output Codes, among others

  • Chapters on classifier selection, diversity, and ensemble feature selection

With firm grounding in the fundamentals of pattern recognition, and featuring more than 140 illustrations, Combining Pattern Classifiers, Second Edition is a valuable reference for postgraduate students, researchers, and practitioners in computing and engineering.

    Table of Contents

    1. Preface
      1. The Playing Field
      2. Software
      3. Structure and What is New in the Second Edition
      4. Who is This Book For?
      5. Notes
    2. Acknowledgements
    3. 1 Fundamentals of Pattern Recognition
      1. 1.1 Basic Concepts: Class, Feature, Data Set
      2. 1.2 Classifier, Discriminant Functions, Classification Regions
      3. 1.3 Classification Error and Classification Accuracy
      4. 1.4 Experimental Comparison of Classifiers
      5. 1.5 Bayes Decision Theory
      6. 1.6 Clustering and Feature Selection
      7. 1.7 Challenges of Real-Life Data
      8. Appendix
      9. 1.A.1 Data Generation
      10. 1.A.2 Comparison of Classifiers
      11. 1.A.3 Feature Selection
      12. Notes
    4. 2 Base Classifiers
      1. 2.1 Linear and Quadratic Classifiers
      2. 2.2 Decision Tree Classifiers
      3. 2.3 The Naïve Bayes Classifier
      4. 2.4 Neural Networks
      5. 2.5 Support Vector Machines
      6. 2.6 The k-Nearest Neighbor Classifier (k-nn)
      7. 2.7 Final Remarks
      8. Appendix
      9. 2.A.1 Matlab Code for the Fish Data
      10. 2.A.2 Matlab Code for Individual Classifiers
      11. Notes
    5. 3 An Overview of the Field
      1. 3.1 Philosophy
      2. 3.2 Two Examples
      3. 3.3 Structure of the Area
      4. 3.4 Quo Vadis?
      5. Notes
    6. 4 Combining Label Outputs
      1. 4.1 Types of Classifier Outputs
      2. 4.2 A Probabilistic Framework for Combining Label Outputs
      3. 4.3 Majority Vote
      4. 4.4 Weighted Majority Vote
      5. 4.5 Naïve-Bayes Combiner
      6. 4.6 Multinomial Methods
      7. 4.7 Comparison of Combination Methods for Label Outputs
      8. Appendix
      9. 4.A.1 Matan’s Proof for the Limits on the Majority Vote Accuracy
      10. 4.A.2 Selected Matlab Code
      11. Notes
    7. 5 Combining Continuous-Valued Outputs
      1. 5.1 Decision Profile
      2. 5.2 How Do We Get Probability Outputs?
      3. 5.3 Nontrainable (Fixed) Combination Rules
      4. 5.4 The Weighted Average (Linear Combiner)
      5. 5.5 A Classifier as a Combiner
      6. 5.6 An Example of Nine Combiners for Continuous-Valued Outputs
      7. 5.7 To Train or Not to Train?
      8. Appendix
      9. 5.A.1 Theoretical Classification Error for the Simple Combiners
      10. 5.A.2 Selected Matlab Code
      11. Notes
    8. 6 Ensemble Methods
      1. 6.1 Bagging
      2. 6.2 Random Forests
      3. 6.3 AdaBoost
      4. 6.4 Random Subspace Ensembles
      5. 6.5 Rotation Forest
      6. 6.6 Random Linear Oracle
      7. 6.7 Error Correcting Output Codes (ECOC)
      8. Appendix
      9. 6.A.1 Bagging
      10. 6.A.2 AdaBoost
      11. 6.A.3 Random Subspace
      12. 6.A.4 Rotation Forest
      13. 6.A.5 Random Linear Oracle
      14. 6.A.6 ECOC
      15. Notes
    9. 7 Classifier Selection
      1. 7.1 Preliminaries
      2. 7.2 Why Classifier Selection Works
      3. 7.3 Estimating Local Competence Dynamically
      4. 7.4 Pre-Estimation of the Competence Regions
      5. 7.5 Simultaneous Training of Regions and Classifiers
      6. 7.6 Cascade Classifiers
      7. Appendix: Selected Matlab Code
      8. 7.A.1 Banana Data
      9. 7.A.2 Evolutionary Algorithm for a Selection Ensemble for the Banana Data
    10. 8 Diversity in Classifier Ensembles
      1. 8.1 What is Diversity?
      2. 8.2 Measuring Diversity in Classifier Ensembles
      3. 8.3 Relationship Between Diversity and Accuracy
      4. 8.4 Using Diversity
      5. 8.5 Conclusions: Diversity of Diversity
      6. Appendix
      7. 8.A.1 Derivation of Diversity Measures for Oracle Outputs
      8. 8.A.2 Diversity Measure Equivalence
      9. 8.A.3 Independent Outputs ≠ Independent Errors
      10. 8.A.4 Bound on the Kappa-Error Diagram
      11. 8.A.5 Calculation of the Pareto Frontier
      12. Notes
    11. 9 Ensemble Feature Selection
      1. 9.1 Preliminaries
      2. 9.2 Ranking by Decision Tree Ensembles
      3. 9.3 Ensembles of Rankers
      4. 9.4 Random Feature Selection for the Ensemble
      5. 9.5 Nonrandom Selection
      6. 9.6 A Stability Index
      7. Appendix
      8. 9.A.1 Matlab Code for the Numerical Example of Ensemble Ranking
      9. 9.A.2 Matlab GA Nuggets
      10. 9.A.3 Matlab Code for the Stability Index
    12. 10 A Final Thought
    13. References
    14. Index
    15. End User License Agreement