A Course in Statistics with R

Book Description

Integrates the theory and applications of statistics using R

A Course in Statistics with R has been written to bridge the gap between theory and applications and to explain how mathematical expressions are converted into R programs. The book has been primarily designed as a useful companion for a Masters student during each semester of the course, but it will also help applied statisticians revisit the underpinnings of the subject. With this dual goal in mind, the book begins with R basics and quickly covers visualization and exploratory analysis. Probability and statistical inference, inclusive of the classical, nonparametric, and Bayesian schools, are developed with definitions, motivations, mathematical expressions, and R programs in a way that helps the reader understand both the mathematical development and the R implementation. Linear regression models, experimental designs, multivariate analysis, and categorical data analysis are treated through practical applications that make effective use of visualization techniques and the statistical techniques underlying them, helping the reader achieve a clear understanding of the associated statistical models.

Key features:

  • Integrates R basics with statistical concepts
  • Provides graphical presentations inclusive of mathematical expressions
  • Aids understanding of limit theorems of probability with and without the simulation approach
  • Presents detailed algorithmic development of statistical models from scratch
  • Includes practical applications with over 50 data sets

Table of Contents

  1. Cover
  2. Title Page
    1. Copyright
    2. Dedication
  3. List of Figures
  4. List of Tables
  5. Preface
  6. Acknowledgments
  7. Part I: The Preliminaries
    1. Chapter 1: Why R?
      1. 1.1 Why R?
      2. 1.2 R Installation
      3. 1.3 There is Nothing such as PRACTICALS
      4. 1.4 Datasets in R and Internet
      5. 1.5 http://cran.r-project.org
      6. 1.6 R and its Interface with other Software
      7. 1.7 help and/or ?
      8. 1.8 R Books
      9. 1.9 A Road Map
    2. Chapter 2: The R Basics
      1. 2.1 Introduction
      2. 2.2 Simple Arithmetics and a Little Beyond
      3. 2.3 Some Basic R Functions
      4. 2.4 Vectors and Matrices in R
      5. 2.5 Data Entering and Reading from Files
      6. 2.6 Working with Packages
      7. 2.7 R Session Management
      8. 2.8 Further Reading
      9. 2.9 Complements, Problems, and Programs
    3. Chapter 3: Data Preparation and Other Tricks
      1. 3.1 Introduction
      2. 3.2 Manipulation with Complex Format Files
      3. 3.3 Reading Datasets of Foreign Formats
      4. 3.4 Displaying R Objects
      5. 3.5 Manipulation Using R Functions
      6. 3.6 Working with Time and Date
      7. 3.7 Text Manipulations
      8. 3.8 Scripts and Text Editors for R
      9. 3.9 Further Reading
      10. 3.10 Complements, Problems, and Programs
    4. Chapter 4: Exploratory Data Analysis
      1. 4.1 Introduction: The Tukey's School of Statistics
      2. 4.2 Essential Summaries of EDA
      3. 4.3 Graphical Techniques in EDA
      4. 4.4 Quantitative Techniques in EDA
      5. 4.5 Exploratory Regression Models
      6. 4.6 Further Reading
      7. 4.7 Complements, Problems, and Programs
  8. Part II: Probability and Inference
    1. Chapter 5: Probability Theory
      1. 5.1 Introduction
      2. 5.2 Sample Space, Set Algebra, and Elementary Probability
      3. 5.3 Counting Methods
      4. 5.4 Probability: A Definition
      5. 5.5 Conditional Probability and Independence
      6. 5.6 Bayes Formula
      7. 5.7 Random Variables, Expectations, and Moments
      8. 5.8 Distribution Function, Characteristic Function, and Moment Generation Function
      9. 5.9 Inequalities
      10. 5.10 Convergence of Random Variables
      11. 5.11 The Law of Large Numbers
      12. 5.12 The Central Limit Theorem
      13. 5.13 Further Reading
      14. 5.14 Complements, Problems, and Programs
    2. Chapter 6: Probability and Sampling Distributions
      1. 6.1 Introduction
      2. 6.2 Discrete Univariate Distributions
      3. 6.3 Continuous Univariate Distributions
      4. 6.4 Multivariate Probability Distributions
      5. 6.5 Populations and Samples
      6. 6.6 Sampling from the Normal Distributions
      7. 6.7 Some Finer Aspects of Sampling Distributions
      8. 6.8 Multivariate Sampling Distributions
      9. 6.9 Bayesian Sampling Distributions
      10. 6.10 Further Reading
      11. 6.11 Complements, Problems, and Programs
    3. Chapter 7: Parametric Inference
      1. 7.1 Introduction
      2. 7.2 Families of Distribution
      3. 7.3 Loss Functions
      4. 7.4 Data Reduction
      5. 7.5 Likelihood and Information
      6. 7.6 Point Estimation
      7. 7.7 Comparison of Estimators
      8. 7.8 Confidence Intervals
      9. 7.9 Testing Statistical Hypotheses–The Preliminaries
      10. 7.10 The Neyman-Pearson Lemma
      11. 7.11 Uniformly Most Powerful Tests
      12. 7.12 Uniformly Most Powerful Unbiased Tests
      13. 7.13 Likelihood Ratio Tests
      14. 7.14 Behrens-Fisher Problem
      15. 7.15 Multiple Comparison Tests
      16. 7.16 The EM Algorithm*
      17. 7.17 Further Reading
      18. 7.18 Complements, Problems, and Programs
    4. Chapter 8: Nonparametric Inference
      1. 8.1 Introduction
      2. 8.2 Empirical Distribution Function and Its Applications
      3. 8.3 The Jackknife and Bootstrap Methods
      4. 8.4 Non-parametric Smoothing
      5. 8.5 Non-parametric Tests
      6. 8.6 Further Reading
      7. 8.7 Complements, Problems, and Programs
    5. Chapter 9: Bayesian Inference
      1. 9.1 Introduction
      2. 9.2 Bayesian Probabilities
      3. 9.3 The Bayesian Paradigm for Statistical Inference
      4. 9.4 Bayesian Estimation
      5. 9.5 The Credible Intervals
      6. 9.6 Bayes Factors for Testing Problems
      7. 9.7 Further Reading
      8. 9.8 Complements, Problems, and Programs
  9. Part III: Stochastic Processes and Monte Carlo
    1. Chapter 10: Stochastic Processes
      1. 10.1 Introduction
      2. 10.2 Kolmogorov's Consistency Theorem
      3. 10.3 Markov Chains
      4. 10.4 Application of Markov Chains in Computational Statistics
      5. 10.5 Further Reading
      6. 10.6 Complements, Problems, and Programs
    2. Chapter 11: Monte Carlo Computations
      1. 11.1 Introduction
      2. 11.2 Generating the (Pseudo-) Random Numbers
      3. 11.3 Simulation from Probability Distributions and Some Limit Theorems
      4. 11.4 Monte Carlo Integration
      5. 11.5 The Accept-Reject Technique
      6. 11.6 Application to Bayesian Inference
      7. 11.7 Further Reading
      8. 11.8 Complements, Problems, and Programs
  10. Part IV: Linear Models
    1. Chapter 12: Linear Regression Models
      1. 12.1 Introduction
      2. 12.2 Simple Linear Regression Model
      3. 12.3 The Anscombe Warnings and Regression Abuse
      4. 12.4 Multiple Linear Regression Model
      5. 12.5 Model Diagnostics for the Multiple Regression Model
      6. 12.6 Multicollinearity
      7. 12.7 Data Transformations
      8. 12.8 Model Selection
      9. 12.9 Further Reading
      10. 12.10 Complements, Problems, and Programs
    2. Chapter 13: Experimental Designs
      1. 13.1 Introduction
      2. 13.2 Principles of Experimental Design
      3. 13.3 Completely Randomized Designs
      4. 13.4 Block Designs
      5. 13.5 Factorial Designs
      6. 13.6 Further Reading
      7. 13.7 Complements, Problems, and Programs
    3. Chapter 14: Multivariate Statistical Analysis - I
      1. 14.1 Introduction
      2. 14.2 Graphical Plots for Multivariate Data
      3. 14.3 Definitions, Notations, and Summary Statistics for Multivariate Data
      4. 14.4 Testing for Mean Vectors: One Sample
      5. 14.5 Testing for Mean Vectors: Two-Samples
      6. 14.6 Multivariate Analysis of Variance
      7. 14.7 Testing for Variance-Covariance Matrix: One Sample
      8. 14.8 Testing for Variance-Covariance Matrix: k-Samples
      9. 14.9 Testing for Independence of Sub-vectors
      10. 14.10 Further Reading
      11. 14.11 Complements, Problems, and Programs
    4. Chapter 15: Multivariate Statistical Analysis - II
      1. 15.1 Introduction
      2. 15.2 Classification and Discriminant Analysis
      3. 15.3 Canonical Correlations
      4. 15.4 Principal Component Analysis – Theory and Illustration
      5. 15.5 Applications of Principal Component Analysis
      6. 15.6 Factor Analysis
      7. 15.7 Further Reading
      8. 15.8 Complements, Problems, and Programs
    5. Chapter 16: Categorical Data Analysis
      1. 16.1 Introduction
      2. 16.2 Graphical Methods for CDA
      3. 16.3 The Odds Ratio
      4. 16.4 The Simpson's Paradox
      5. 16.5 The Binomial, Multinomial, and Poisson Models
      6. 16.6 The Problem of Overdispersion
      7. 16.7 The χ²-Tests of Independence
      8. 16.8 Further Reading
      9. 16.9 Complements, Problems, and Programs
    6. Chapter 17: Generalized Linear Models
      1. 17.1 Introduction
      2. 17.2 Regression Problems in Count/Discrete Data
      3. 17.3 Exponential Family and the GLM
      4. 17.4 The Logistic Regression Model
      5. 17.5 Inference for the Logistic Regression Model
      6. 17.6 Model Selection in Logistic Regression Models
      7. 17.7 Probit Regression
      8. 17.8 Poisson Regression Model
      9. 17.9 Further Reading
      10. 17.10 Complements, Problems, and Programs
  11. Appendix A: Open Source Software–An Epilogue
  12. Appendix B: The Statistical Tables
  13. Bibliography
    1. Author Index
    2. Subject Index
    3. R Codes
  14. End User License Agreement