Bayesian Logical Data Analysis for the Physical Sciences

Book Description

Bayesian inference provides a simple and unified approach to data analysis, allowing experimenters to assign probabilities to competing hypotheses of interest on the basis of the current state of knowledge. By incorporating relevant prior information, it can sometimes improve model parameter estimates by many orders of magnitude. This book provides a clear exposition of the underlying concepts with many worked examples and problem sets. It also discusses implementation, including an introduction to Markov chain Monte Carlo integration and to linear and nonlinear model fitting. The particularly extensive coverage of spectral analysis (detecting and measuring periodic signals) includes a self-contained introduction to Fourier and discrete Fourier methods. There is a chapter devoted to Bayesian inference with Poisson sampling, and three chapters on frequentist methods help to bridge the gap between the frequentist and Bayesian approaches. Supporting Mathematica® notebooks with solutions to selected problems, additional worked examples, and a Mathematica tutorial are available at www.cambridge.org/9780521150125.
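For readers new to the approach, the relation the book builds everything on is Bayes' theorem. A standard statement, using the common notation in which D denotes the data, Hᵢ a hypothesis under consideration, and I the prior information, is

$$
p(H_i \mid D, I) = \frac{p(H_i \mid I)\, p(D \mid H_i, I)}{p(D \mid I)},
$$

where $p(H_i \mid I)$ is the prior probability of the hypothesis, $p(D \mid H_i, I)$ is the likelihood of the data given that hypothesis, and the denominator $p(D \mid I)$ is a normalization obtained by summing (or integrating) the numerator over the full hypothesis space.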

Table of Contents

  1. Cover
  2. Half Title
  3. Title Page
  4. Copyright
  5. Contents
  6. Preface
    1. Software support
  7. Acknowledgements
  8. 1. Role of probability theory in science
    1. 1.1 Scientific inference
    2. 1.2 Inference requires a probability theory
      1. 1.2.1 The two rules for manipulating probabilities
    3. 1.3 Usual form of Bayes’ theorem
      1. 1.3.1 Discrete hypothesis space
      2. 1.3.2 Continuous hypothesis space
      3. 1.3.3 Bayes’ theorem – model of the learning process
      4. 1.3.4 Example of the use of Bayes’ theorem
    4. 1.4 Probability and frequency
      1. 1.4.1 Example: incorporating frequency information
    5. 1.5 Marginalization
    6. 1.6 The two basic problems in statistical inference
    7. 1.7 Advantages of the Bayesian approach
    8. 1.8 Problems
  9. 2. Probability theory as extended logic
    1. 2.1 Overview
    2. 2.2 Fundamentals of logic
      1. 2.2.1 Logical propositions
      2. 2.2.2 Compound propositions
      3. 2.2.3 Truth tables and Boolean algebra
      4. 2.2.4 Deductive inference
      5. 2.2.5 Inductive or plausible inference
    3. 2.3 Brief history
    4. 2.4 An adequate set of operations
      1. 2.4.1 Examination of a logic function
    5. 2.5 Operations for plausible inference
      1. 2.5.1 The desiderata of Bayesian probability theory
      2. 2.5.2 Development of the product rule
      3. 2.5.3 Development of sum rule
      4. 2.5.4 Qualitative properties of product and sum rules
    6. 2.6 Uniqueness of the product and sum rules
    7. 2.7 Summary
    8. 2.8 Problems
  10. 3. The how-to of Bayesian inference
    1. 3.1 Overview
    2. 3.2 Basics
    3. 3.3 Parameter estimation
    4. 3.4 Nuisance parameters
    5. 3.5 Model comparison and Occam’s razor
    6. 3.6 Sample spectral line problem
      1. 3.6.1 Background information
    7. 3.7 Odds ratio
      1. 3.7.1 Choice of prior p(T|M₁, I)
      2. 3.7.2 Calculation of p(D|M₁, T, I)
      3. 3.7.3 Calculation of p(D|M₂, I)
      4. 3.7.4 Odds, uniform prior
      5. 3.7.5 Odds, Jeffreys prior
    8. 3.8 Parameter estimation problem
      1. 3.8.1 Sensitivity of odds to Tₘₐₓ
    9. 3.9 Lessons
    10. 3.10 Ignorance priors
    11. 3.11 Systematic errors
      1. 3.11.1 Systematic error example
    12. 3.12 Problems
  11. 4. Assigning probabilities
    1. 4.1 Introduction
    2. 4.2 Binomial distribution
      1. 4.2.1 Bernoulli’s law of large numbers
      2. 4.2.2 The gambler’s coin problem
      3. 4.2.3 Bayesian analysis of an opinion poll
    3. 4.3 Multinomial distribution
    4. 4.4 Can you really answer that question?
    5. 4.5 Logical versus causal connections
    6. 4.6 Exchangeable distributions
    7. 4.7 Poisson distribution
      1. 4.7.1 Bayesian and frequentist comparison
    8. 4.8 Constructing likelihood functions
      1. 4.8.1 Deterministic model
      2. 4.8.2 Probabilistic model
    9. 4.9 Summary
    10. 4.10 Problems
  12. 5. Frequentist statistical inference
    1. 5.1 Overview
    2. 5.2 The concept of a random variable
    3. 5.3 Sampling theory
    4. 5.4 Probability distributions
    5. 5.5 Descriptive properties of distributions
      1. 5.5.1 Relative line shape measures for distributions
      2. 5.5.2 Standard random variable
      3. 5.5.3 Other measures of central tendency and dispersion
      4. 5.5.4 Median baseline subtraction
    6. 5.6 Moment generating functions
    7. 5.7 Some discrete probability distributions
      1. 5.7.1 Binomial distribution
      2. 5.7.2 The Poisson distribution
      3. 5.7.3 Negative binomial distribution
    8. 5.8 Continuous probability distributions
      1. 5.8.1 Normal distribution
      2. 5.8.2 Uniform distribution
      3. 5.8.3 Gamma distribution
      4. 5.8.4 Beta distribution
      5. 5.8.5 Negative exponential distribution
    9. 5.9 Central Limit Theorem
    10. 5.10 Bayesian demonstration of the Central Limit Theorem
    11. 5.11 Distribution of the sample mean
      1. 5.11.1 Signal averaging example
    12. 5.12 Transformation of a random variable
    13. 5.13 Random and pseudo-random numbers
      1. 5.13.1 Pseudo-random number generators
      2. 5.13.2 Tests for randomness
    14. 5.14 Summary
    15. 5.15 Problems
  13. 6. What is a statistic?
    1. 6.1 Introduction
    2. 6.2 The χ² distribution
    3. 6.3 Sample variance S²
    4. 6.4 The Student’s t distribution
    5. 6.5 F distribution (F-test)
    6. 6.6 Confidence intervals
      1. 6.6.1 Variance σ² known
      2. 6.6.2 Confidence intervals for μ, unknown variance
      3. 6.6.3 Confidence intervals: difference of two means
      4. 6.6.4 Confidence intervals for σ²
      5. 6.6.5 Confidence intervals: ratio of two variances
    7. 6.7 Summary
    8. 6.8 Problems
  14. 7. Frequentist hypothesis testing
    1. 7.1 Overview
    2. 7.2 Basic idea
      1. 7.2.1 Hypothesis testing with the χ² statistic
      2. 7.2.2 Hypothesis test on the difference of two means
      3. 7.2.3 One-sided and two-sided hypothesis tests
    3. 7.3 Are two distributions the same?
      1. 7.3.1 Pearson χ² goodness-of-fit test
      2. 7.3.2 Comparison of two binned data sets
    4. 7.4 Problem with frequentist hypothesis testing
      1. 7.4.1 Bayesian resolution to optional stopping problem
    5. 7.5 Problems
  15. 8. Maximum entropy probabilities
    1. 8.1 Overview
    2. 8.2 The maximum entropy principle
    3. 8.3 Shannon’s theorem
    4. 8.4 Alternative justification of MaxEnt
    5. 8.5 Generalizing MaxEnt
      1. 8.5.1 Incorporating a prior
      2. 8.5.2 Continuous probability distributions
    6. 8.6 How to apply the MaxEnt principle
      1. 8.6.1 Lagrange multipliers of variational calculus
    7. 8.7 MaxEnt distributions
      1. 8.7.1 General properties
      2. 8.7.2 Uniform distribution
      3. 8.7.3 Exponential distribution
      4. 8.7.4 Normal and truncated Gaussian distributions
      5. 8.7.5 Multivariate Gaussian distribution
    8. 8.8 MaxEnt image reconstruction
      1. 8.8.1 The kangaroo justification
      2. 8.8.2 MaxEnt for uncertain constraints
    9. 8.9 Pixon multiresolution image reconstruction
    10. 8.10 Problems
  16. 9. Bayesian inference with Gaussian errors
    1. 9.1 Overview
    2. 9.2 Bayesian estimate of a mean
      1. 9.2.1 Mean: known noise σ
      2. 9.2.2 Mean: known noise, unequal σ
      3. 9.2.3 Mean: unknown noise σ
      4. 9.2.4 Bayesian estimate of σ
    3. 9.3 Is the signal variable?
    4. 9.4 Comparison of two independent samples
      1. 9.4.1 Do the samples differ?
      2. 9.4.2 How do the samples differ?
      3. 9.4.3 Results
      4. 9.4.4 The difference in means
      5. 9.4.5 Ratio of the standard deviations
      6. 9.4.6 Effect of the prior ranges
    5. 9.5 Summary
    6. 9.6 Problems
  17. 10. Linear model fitting (Gaussian errors)
    1. 10.1 Overview
    2. 10.2 Parameter estimation
      1. 10.2.1 Most probable amplitudes
      2. 10.2.2 More powerful matrix formulation
    3. 10.3 Regression analysis
    4. 10.4 The posterior is a Gaussian
      1. 10.4.1 Joint credible regions
    5. 10.5 Model parameter errors
      1. 10.5.1 Marginalization and the covariance matrix
      2. 10.5.2 Correlation coefficient
      3. 10.5.3 More on model parameter errors
    6. 10.6 Correlated data errors
    7. 10.7 Model comparison with Gaussian posteriors
    8. 10.8 Frequentist testing and errors
      1. 10.8.1 Other model comparison methods
    9. 10.9 Summary
    10. 10.10 Problems
  18. 11. Nonlinear model fitting
    1. 11.1 Introduction
    2. 11.2 Asymptotic normal approximation
    3. 11.3 Laplacian approximations
      1. 11.3.1 Bayes factor
      2. 11.3.2 Marginal parameter posteriors
    4. 11.4 Finding the most probable parameters
      1. 11.4.1 Simulated annealing
      2. 11.4.2 Genetic algorithm
    5. 11.5 Iterative linearization
      1. 11.5.1 Levenberg–Marquardt method
      2. 11.5.2 Marquardt’s recipe
    6. 11.6 Mathematica example
      1. 11.6.1 Model comparison
      2. 11.6.2 Marginal and projected distributions
    7. 11.7 Errors in both coordinates
    8. 11.8 Summary
    9. 11.9 Problems
  19. 12. Markov chain Monte Carlo
    1. 12.1 Overview
    2. 12.2 Metropolis–Hastings algorithm
    3. 12.3 Why does Metropolis–Hastings work?
    4. 12.4 Simulated tempering
    5. 12.5 Parallel tempering
    6. 12.6 Example
    7. 12.7 Model comparison
    8. 12.8 Towards an automated MCMC
    9. 12.9 Extrasolar planet example
      1. 12.9.1 Model probabilities
      2. 12.9.2 Results
    10. 12.10 MCMC robust summary statistic
    11. 12.11 Summary
    12. 12.12 Problems
  20. 13. Bayesian revolution in spectral analysis
    1. 13.1 Overview
    2. 13.2 New insights on the periodogram
      1. 13.2.1 How to compute p(f|D, I)
    3. 13.3 Strong prior signal model
    4. 13.4 No specific prior signal model
      1. 13.4.1 X-ray astronomy example
      2. 13.4.2 Radio astronomy example
    5. 13.5 Generalized Lomb–Scargle periodogram
      1. 13.5.1 Relationship to Lomb–Scargle periodogram
      2. 13.5.2 Example
    6. 13.6 Non-uniform sampling
    7. 13.7 Problems
  21. 14. Bayesian inference with Poisson sampling
    1. 14.1 Overview
    2. 14.2 Infer a Poisson rate
      1. 14.2.1 Summary of posterior
    3. 14.3 Signal + known background
    4. 14.4 Analysis of ON/OFF measurements
      1. 14.4.1 Estimating the source rate
      2. 14.4.2 Source detection question
    5. 14.5 Time-varying Poisson rate
    6. 14.6 Problems
  22. Appendix A: Singular value decomposition
  23. Appendix B: Discrete Fourier Transforms
    1. B.1 Overview
    2. B.2 Orthogonal and orthonormal functions
    3. B.3 Fourier series and integral transform
      1. B.3.1 Fourier series
      2. B.3.2 Fourier transform
    4. B.4 Convolution and correlation
      1. B.4.1 Convolution theorem
      2. B.4.2 Correlation theorem
      3. B.4.3 Importance of convolution in science
    5. B.5 Waveform sampling
    6. B.6 Nyquist sampling theorem
      1. B.6.1 Astronomy example
    7. B.7 Discrete Fourier Transform
      1. B.7.1 Graphical development
      2. B.7.2 Mathematical development of the DFT
      3. B.7.3 Inverse DFT
    8. B.8 Applying the DFT
      1. B.8.1 DFT as an approximate Fourier transform
      2. B.8.2 Inverse discrete Fourier transform
    9. B.9 The Fast Fourier Transform
    10. B.10 Discrete convolution and correlation
      1. B.10.1 Deconvolving a noisy signal
      2. B.10.2 Deconvolution with an optimal Wiener filter
      3. B.10.3 Treatment of end effects by zero padding
    11. B.11 Accurate amplitudes by zero padding
    12. B.12 Power-spectrum estimation
      1. B.12.1 Parseval’s theorem and power spectral density
      2. B.12.2 Periodogram power-spectrum estimation
      3. B.12.3 Correlation spectrum estimation
    13. B.13 Discrete power spectral density estimation
      1. B.13.1 Discrete form of Parseval’s theorem
      2. B.13.2 One-sided discrete power spectral density
      3. B.13.3 Variance of periodogram estimate
      4. B.13.4 Yule’s stochastic spectrum estimation model
      5. B.13.5 Reduction of periodogram variance
    14. B.14 Problems
  24. Appendix C: Difference in two samples
    1. C.1 Outline
    2. C.2 Probabilities of the four hypotheses
      1. C.2.1 Evaluation of p(C, S|D₁, D₂, I)
      2. C.2.2 Evaluation of p(C, S̄|D₁, D₂, I)
      3. C.2.3 Evaluation of p(C̄, S|D₁, D₂, I)
      4. C.2.4 Evaluation of p(C̄, S̄|D₁, D₂, I)
    3. C.3 The difference in the means
      1. C.3.1 The two-sample problem
      2. C.3.2 The Behrens–Fisher problem
    4. C.4 The ratio of the standard deviations
      1. C.4.1 Estimating the ratio, given the means are the same
      2. C.4.2 Estimating the ratio, given the means are different
  25. Appendix D: Poisson ON/OFF details
    1. D.1 Derivation of p(s|Nₒₙ, I)
      1. D.1.1 Evaluation of Num
      2. D.1.2 Evaluation of Den
    2. D.2 Derivation of the Bayes factor B_{s+b,b}
  26. Appendix E: Multivariate Gaussian from maximum entropy
  27. References
  28. Index