Feature Selection and Ensemble Methods for Bioinformatics

Book Description

Machine learning is the branch of artificial intelligence whose goal is to develop algorithms that give computers the ability to learn. Ensembles are an integral part of machine learning. A typical ensemble combines several algorithms, each predicting the class label, or the degree of class membership, for an input described by a set of measurable characteristics, often called features. Feature Selection and Ensemble Methods for Bioinformatics: Algorithmic Classification and Implementations offers a unique perspective on the machine learning aspects of cancer classification based on microarray gene expression data. This multidisciplinary text sits at the intersection of computer science and biology and can therefore serve as a reference for researchers and students in both fields. Each chapter describes the process of algorithm design from beginning to end and aims to acquaint readers with best practices for use in their own research.
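The two themes in the title, selecting informative features (genes) and combining base classifiers into an ensemble, can be illustrated with a short sketch. The example below assumes scikit-learn and a synthetic data set purely for illustration; it is not taken from the book, which presents its own algorithms and implementations in the chapters listed in the table of contents.

```python
# Minimal sketch (not from the book): univariate feature selection followed
# by a majority-vote ensemble of the kinds of base classifiers the book
# covers (Naive Bayes, nearest neighbor, classification tree, SVM).
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.ensemble import VotingClassifier
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for a microarray data set: few samples, many features.
X, y = make_classification(n_samples=100, n_features=2000,
                           n_informative=20, random_state=0)

# Hard majority vote over the predicted class labels of the base learners.
ensemble = VotingClassifier(estimators=[
    ("nb", GaussianNB()),
    ("knn", KNeighborsClassifier(n_neighbors=3)),
    ("tree", DecisionTreeClassifier(random_state=0)),
    ("svm", SVC(kernel="linear")),
])

# Keep the 50 features with the highest univariate F-scores, then classify.
model = make_pipeline(SelectKBest(f_classif, k=50), ensemble)
print(cross_val_score(model, X, y, cv=5).mean())
```

The base learners mirror the classifiers treated in Chapters 4 through 7, and the univariate filter stands in for the feature and gene selection methods discussed in Chapters 8 through 15; the book develops these components and their evaluation in far greater depth.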

Table of Contents

  1. Cover
  2. Title Page
  3. Copyright Page
  4. Preface
    1. A FEW WORDS ABOUT THE BOOK
    2. REFERENCES
  5. Chapter 1: Biological Background
    1. A LITTLE BIT OF BIOLOGY
    2. REFERENCES
    3. ENDNOTES
  6. Chapter 2: Gene Expression Data Sets
    1. BIOLOGICAL DATA AND THEIR CHARACTERISTICS
    2. REFERENCES
    3. ENDNOTES
  7. Chapter 3: Introduction to Data Classification
    1. PROBLEM OF DATA CLASSIFICATION
    2. REFERENCES
    3. ENDNOTES
  8. Chapter 4: Naïve Bayes
    1. BAYES AND NAÏVE BAYES
    2. REFERENCES
    3. ENDNOTES
  9. Chapter 5: Nearest Neighbor
    1. NEAREST NEIGHBOR CLASSIFICATION
    2. REFERENCES
    3. ENDNOTES
  10. Chapter 6: Classification Tree
    1. TREE-LIKE CLASSIFIER
    2. REFERENCES
    3. ENDNOTES
  11. Chapter 7: Support Vector Machines
    1. SUPPORT VECTOR MACHINES
    2. REFERENCES
    3. ENDNOTES
  12. Chapter 8: Introduction to Feature and Gene Selection
    1. PROBLEM OF FEATURE SELECTION
    2. REFERENCES
    3. ENDNOTES
  13. Chapter 9: Feature Selection Based on Elements of Game Theory
    1. FEATURE SELECTION BASED ON THE SHAPLEY VALUE
    2. REFERENCES
    3. ENDNOTE
  14. Chapter 10: Kernel-Based Feature Selection with the Hilbert-Schmidt Independence Criterion
    1. KERNEL METHODS AND FEATURE SELECTION
    2. REFERENCES
    3. ENDNOTES
  15. Chapter 11: Extreme Value Distribution Based Gene Selection
    1. BLEND OF ELEMENTS OF EXTREME VALUE THEORY AND LOGISTIC REGRESSION
    2. REFERENCES
    3. ENDNOTES
  16. Chapter 12: Evolutionary Algorithm for Identifying Predictive Genes
    1. EVOLUTIONARY SEARCH FOR OPTIMAL OR NEAR-OPTIMAL SET OF GENES
    2. REFERENCES
    3. ENDNOTES
  17. Chapter 13: Redundancy-Based Feature Selection
    1. REDUNDANCY OF FEATURES
    2. REFERENCES
    3. ENDNOTES
  18. Chapter 14: Unsupervised Feature Selection
    1. UNSUPERVISED FEATURE FILTERING
    2. REFERENCES
  19. Chapter 15: Differential Evolution for Finding Predictive Gene Subsets
    1. DIFFERENTIAL EVOLUTION – GLOBAL, EVOLUTION STRATEGY BASED OPTIMIZATION METHOD
    2. REFERENCES
    3. ENDNOTES
  20. Chapter 16: Ensembles of Classifiers
    1. ENSEMBLE LEARNING
    2. REFERENCES
    3. ENDNOTES
  21. Chapter 17: Classifier Ensembles Built on Subsets of Features
    1. SHAKING STABLE CLASSIFIERS
    2. REFERENCES
    3. ENDNOTES
  22. Chapter 18: Bagging and Random Forests
    1. BOOTSTRAP AND ITS USE IN CLASSIFIER ENSEMBLES
    2. REFERENCES
    3. ENDNOTES
  23. Chapter 19: Boosting and AdaBoost
    1. WEIGHTED LEARNING, BOOSTING AND ADABOOST
    2. REFERENCES
    3. ENDNOTES
  24. Chapter 20: Ensemble Gene Selection
    1. GETTING IMPORTANT GENES OUT OF A POOL
    2. REFERENCES
    3. ENDNOTES
  25. Chapter 21: Introduction to Classification Error Estimation
    1. PROBLEM OF CLASSIFICATION ERROR ESTIMATION
    2. REFERENCES
    3. ENDNOTES
  26. Chapter 22: ROC Curve, Area under It, Other Classification Performance Characteristics and Statistical Tests
    1. CLASSIFICATION PERFORMANCE EVALUATION
    2. REFERENCES
    3. ENDNOTES
  27. Chapter 23: Bolstered Resubstitution Error
    1. ALTERNATIVE TO TRADITIONAL ERROR ESTIMATORS
    2. REFERENCES
    3. ENDNOTES
  28. Chapter 24: Performance Evaluation
    1. BAYESIAN CONFIDENCE (CREDIBLE) INTERVAL
    2. REFERENCES
    3. ENDNOTES
  29. Chapter 25: Application Examples
    1. JOINING ALL PIECES TOGETHER
    2. REFERENCES
    3. ENDNOTE
  30. Chapter 26: End Remarks
    1. A FEW WORDS IN THE END
    2. REFERENCES
  31. About the Contributors