Computer Vision: Models, Learning, and Inference

Book Description

This modern treatment of computer vision focuses on learning and inference in probabilistic models as a unifying theme. It shows how to use training data to learn the relationships between the observed image data and the aspects of the world that we wish to estimate, such as the 3D structure or the object class, and how to exploit these relationships to make new inferences about the world from new image data. With minimal prerequisites, the book starts from the basics of probability and model fitting and works up to real examples that the reader can implement and modify to build useful vision systems. Primarily meant for advanced undergraduate and graduate students, the detailed methodological presentation will also be useful for practitioners of computer vision.

  • Covers cutting-edge techniques, including graph cuts, machine learning and multiple view geometry
  • A unified approach shows the common basis for solutions of important computer vision problems, such as camera calibration, face recognition and object tracking
  • More than 70 algorithms are described in sufficient detail to implement
  • More than 350 full-color illustrations amplify the text
  • The treatment is self-contained, including all of the background mathematics
  • Additional resources at www.computervisionmodels.com

Table of Contents

  1. Cover
  2. Title
  3. Copyright
  4. Dedication
  5. Acknowledgements
  6. Foreword
  7. Preface
  8. Chapter 1: Introduction
    1. Organization of the book
    2. Other books
  9. Part I: Probability
    1. Chapter 2: Introduction to probability
      1. 2.1 Random variables
      2. 2.2 Joint probability
      3. 2.3 Marginalization
      4. 2.4 Conditional probability
      5. 2.5 Bayes’ rule
      6. 2.6 Independence
      7. 2.7 Expectation
    2. Chapter 3: Common probability distributions
      1. 3.1 Bernoulli distribution
      2. 3.2 Beta distribution
      3. 3.3 Categorical distribution
      4. 3.4 Dirichlet distribution
      5. 3.5 Univariate normal distribution
      6. 3.6 Normal-scaled inverse gamma distribution
      7. 3.7 Multivariate normal distribution
      8. 3.8 Normal inverse Wishart distribution
      9. 3.9 Conjugacy
    3. Chapter 4: Fitting probability models
      1. 4.1 Maximum likelihood
      2. 4.2 Maximum a posteriori
      3. 4.3 The Bayesian approach
      4. 4.4 Worked example 1: Univariate normal
      5. 4.5 Worked example 2: Categorical distribution
    4. Chapter 5: The normal distribution
      1. 5.1 Types of covariance matrix
      2. 5.2 Decomposition of covariance
      3. 5.3 Linear transformations of variables
      4. 5.4 Marginal distributions
      5. 5.5 Conditional distributions
      6. 5.6 Product of two normals
      7. 5.7 Change of variable
  10. Part II: Machine learning for machine vision
    1. Chapter 6: Learning and inference in vision
      1. 6.1 Computer vision problems
      2. 6.2 Types of model
      3. 6.3 Example 1: Regression
      4. 6.4 Example 2: Binary classification
      5. 6.5 Which type of model should we use?
      6. 6.6 Applications
    2. Chapter 7: Modeling complex data densities
      1. 7.1 Normal classification model
      2. 7.2 Hidden variables
      3. 7.3 Expectation maximization
      4. 7.4 Mixture of Gaussians
      5. 7.5 The t-distribution
      6. 7.6 Factor analysis
      7. 7.7 Combining models
      8. 7.8 Expectation maximization in detail
      9. 7.9 Applications
    3. Chapter 8: Regression models
      1. 8.1 Linear regression
      2. 8.2 Bayesian linear regression
      3. 8.3 Nonlinear regression
      4. 8.4 Kernels and the kernel trick
      5. 8.5 Gaussian process regression
      6. 8.6 Sparse linear regression
      7. 8.7 Dual linear regression
      8. 8.8 Relevance vector regression
      9. 8.9 Regression to multivariate data
      10. 8.10 Applications
    4. Chapter 9: Classification models
      1. 9.1 Logistic regression
      2. 9.2 Bayesian logistic regression
      3. 9.3 Nonlinear logistic regression
      4. 9.4 Dual logistic regression
      5. 9.5 Kernel logistic regression
      6. 9.6 Relevance vector classification
      7. 9.7 Incremental fitting and boosting
      8. 9.8 Classification trees
      9. 9.9 Multiclass logistic regression
      10. 9.10 Random trees, forests, and ferns
      11. 9.11 Relation to non-probabilistic models
      12. 9.12 Applications
  11. Part III: Connecting local models
    1. Chapter 10: Graphical models
      1. 10.1 Conditional independence
      2. 10.2 Directed graphical models
      3. 10.3 Undirected graphical models
      4. 10.4 Comparing directed and undirected graphical models
      5. 10.5 Graphical models in computer vision
      6. 10.6 Inference in models with many unknowns
      7. 10.7 Drawing samples
      8. 10.8 Learning
    2. Chapter 11: Models for chains and trees
      1. 11.1 Models for chains
      2. 11.2 MAP inference for chains
      3. 11.3 MAP inference for trees
      4. 11.4 Marginal posterior inference for chains
      5. 11.5 Marginal posterior inference for trees
      6. 11.6 Learning in chains and trees
      7. 11.7 Beyond chains and trees
      8. 11.8 Applications
    3. Chapter 12: Models for grids
      1. 12.1 Markov random fields
      2. 12.2 MAP inference for binary pairwise MRFs
      3. 12.3 MAP inference for multilabel pairwise MRFs
      4. 12.4 Multilabel MRFs with non-convex potentials
      5. 12.5 Conditional random fields
      6. 12.6 Higher order models
      7. 12.7 Directed models for grids
      8. 12.8 Applications
  12. Part IV: Preprocessing
    1. Chapter 13: Image preprocessing and feature extraction
      1. 13.1 Per-pixel transformations
      2. 13.2 Edges, corners, and interest points
      3. 13.3 Descriptors
      4. 13.4 Dimensionality reduction
  13. Part V: Models for geometry
    1. Chapter 14: The pinhole camera
      1. 14.1 The pinhole camera
      2. 14.2 Three geometric problems
      3. 14.3 Homogeneous coordinates
      4. 14.4 Learning extrinsic parameters
      5. 14.5 Learning intrinsic parameters
      6. 14.6 Inferring three-dimensional world points
      7. 14.7 Applications
    2. Chapter 15: Models for transformations
      1. 15.1 Two-dimensional transformation models
      2. 15.2 Learning in transformation models
      3. 15.3 Inference in transformation models
      4. 15.4 Three geometric problems for planes
      5. 15.5 Transformations between images
      6. 15.6 Robust learning of transformations
      7. 15.7 Applications
    3. Chapter 16: Multiple cameras
      1. 16.1 Two-view geometry
      2. 16.2 The essential matrix
      3. 16.3 The fundamental matrix
      4. 16.4 Two-view reconstruction pipeline
      5. 16.5 Rectification
      6. 16.6 Multiview reconstruction
      7. 16.7 Applications
  14. Part VI: Models for vision
    1. Chapter 17: Models for shape
      1. 17.1 Shape and its representation
      2. 17.2 Snakes
      3. 17.3 Shape templates
      4. 17.4 Statistical shape models
      5. 17.5 Subspace shape models
      6. 17.6 Three-dimensional shape models
      7. 17.7 Statistical models for shape and appearance
      8. 17.8 Non-Gaussian statistical shape models
      9. 17.9 Articulated models
      10. 17.10 Applications
    2. Chapter 18: Models for style and identity
      1. 18.1 Subspace identity model
      2. 18.2 Probabilistic linear discriminant analysis
      3. 18.3 Nonlinear identity models
      4. 18.4 Asymmetric bilinear models
      5. 18.5 Symmetric bilinear and multilinear models
      6. 18.6 Applications
    3. Chapter 19: Temporal models
      1. 19.1 Temporal estimation framework
      2. 19.2 Kalman filter
      3. 19.3 Extended Kalman filter
      4. 19.4 Unscented Kalman filter
      5. 19.5 Particle filtering
      6. 19.6 Applications
    4. Chapter 20: Models for visual words
      1. 20.1 Images as collections of visual words
      2. 20.2 Bag of words
      3. 20.3 Latent Dirichlet allocation
      4. 20.4 Single author–topic model
      5. 20.5 Constellation models
      6. 20.6 Scene models
      7. 20.7 Applications
  15. Part VII: Appendices
    1. Appendix A: Notation
    2. Appendix B: Optimization
      1. B.1 Problem statement
      2. B.2 Choosing a search direction
      3. B.3 Line search
      4. B.4 Reparameterization
    3. Appendix C: Linear algebra
      1. C.1 Vectors
      2. C.2 Matrices
      3. C.3 Tensors
      4. C.4 Linear transformations
      5. C.5 Singular value decomposition
      6. C.6 Matrix calculus
      7. C.7 Common problems
      8. C.8 Tricks for inverting large matrices
  16. Bibliography
  17. Index