Beginning R: An Introduction to Statistical Programming

Book Description

Beginning R: An Introduction to Statistical Programming is a hands-on book showing how to use the R language, write and save R scripts, build and import data files, and write your own custom statistical functions. R is a powerful open-source implementation of the statistical language S, which was developed by AT&T. R has eclipsed S and the commercially available S-Plus language, and has become the de facto standard for doing, teaching, and learning computational statistics.

R is both an object-oriented language and a functional language that is easy to learn, easy to use, and completely free. A large community of dedicated R users and programmers provides an excellent source of R code, functions, and data sets. R is also being adopted into commercial tools such as Oracle Database. Your investment in learning R is sure to pay off in the long term as R continues to grow into the go-to language for statistical exploration and research.

  • Covers the freely available R language for statistics

  • Shows the use of R in specific use cases such as simulations, discrete probability solutions, one-way ANOVA, and more

  • Takes a hands-on, example-based approach that incorporates best practices, with clear explanations of the statistics being done

What you'll learn

  • Acquire and install R

  • Import and export data and scripts

  • Generate basic statistics and graphics

  • Program in R to write custom functions

  • Use R for interactive statistical explorations

  • Implement simulations and other advanced techniques

Who this book is for

Beginning R: An Introduction to Statistical Programming is an easy-to-read book that serves as an instruction manual and reference for working professionals, professors, and students who want to learn and use R for basic statistics. It is the perfect book for anyone needing a free, capable, and powerful tool for exploring statistics and automating their use.

Table of Contents

  1. Title
  2. Dedication
  3. Contents at a Glance
  4. Contents
  5. About the Author
  6. About the Technical Reviewer
  7. Acknowledgments
  8. Introduction
  9. Chapter 1: Getting R and Getting Started
    1. Getting and Using R
    2. A First R Session
    3. Moving Around in R
    4. Working with Data in R
    5. Dealing With Missing Data in R
    6. Conclusion
  10. Chapter 2: Programming in R
    1. What is Programming?
    2. Getting Ready to Program
    3. The Requirements for Learning to Program
    4. Flow Control
    5. Essentials of R Programming
    6. Understanding the R Environment
    7. Implementation of Program Flow in R
    8. A First R Program
    9. Another Example—Finding Pythagorean Triples
    10. Using R to Solve Quadratic Equations
    11. Why R is Object-Oriented
    12. Conclusion
  11. Chapter 3: Writing Reusable Functions
    1. Examining an R Function from the Base R Code
    2. Creating a Function
    3. Calculating a Confidence Interval for a Mean
    4. Avoiding Loops with Vectorized Operations
    5. Vectorizing If-Else Statements Using ifelse()
    6. Making More Powerful Functions
    7. Any, All, and Which
    8. Making Functions More Useful
    9. Confidence Intervals Revisited
    10. Conclusion
  12. Chapter 4: Summary Statistics
    1. Measuring Central Tendency
    2. Measuring Location via Standard Scores
    3. Measuring Variability
    4. Covariance and Correlation
    5. Measuring Symmetry (or Lack Thereof)
    6. Conclusion
  13. Chapter 5: Creating Tables and Graphs
    1. Frequency Distributions and Tables
    2. Pie Charts and Bar Charts
    3. Boxplots
    4. Histograms
    5. Line Graphs
    6. Scatterplots
    7. Saving and Using Graphics
    8. Conclusion
  14. Chapter 6: Discrete Probability Distributions
    1. Discrete Probability Distributions
    2. Bernoulli Processes
    3. Relating Discrete Probability to Normal Probability
    4. Conclusion
  15. Chapter 7: Computing Normal Probabilities
    1. Characteristics of the Normal Distribution
    2. The Sampling Distribution of Means
    3. A One-sample z Test
    4. Conclusion
  16. Chapter 8: Creating Confidence Intervals
    1. Confidence Intervals for Means
    2. Confidence Intervals for Proportions
    3. Understanding the Chi-square Distribution
    4. Confidence Intervals for Variances and Standard Deviations
    5. Confidence Intervals for Differences between Means
    6. Confidence Intervals Using the stats Package
    7. Conclusion
  17. Chapter 9: Performing t Tests
    1. A Brief Introduction to Hypothesis Testing
    2. Understanding the t Distribution
    3. The One-sample t Test
    4. The Paired-samples t Test
    5. Two-sample t Tests
    6. A Note on Effect Size for the t Test
    7. Conclusion
  18. Chapter 10: One-Way Analysis of Variance
    1. Understanding the F Distribution
    2. Using the F Distribution to Test Variances
    3. Compounding Alpha and Post Hoc Comparisons
    4. One-Way ANOVA
    5. Using the anova Function
    6. Conclusion
  19. Chapter 11: Advanced Analysis of Variance
    1. Two-Way ANOVA
    2. Repeated-Measures ANOVA
    3. Mixed-Factorial ANOVA
    4. Conclusion
  20. Chapter 12: Correlation and Regression
    1. Covariance and Correlation
    2. Regression
    3. An Example: Predicting the Price of Gasoline
    4. Determining Confidence and Prediction Intervals
    5. Conclusion
  21. Chapter 13: Multiple Regression
    1. The Multiple Regression Equation
    2. Multiple Regression Example: Predicting Job Satisfaction
    3. Using Matrix Algebra to Solve a Regression Equation
    4. Brief Introduction to the General Linear Model
    5. More on Multiple Regression
    6. Conclusion
  22. Chapter 14: Logistic Regression
    1. What Is Logistic Regression?
    2. Logistic Regression with One Dichotomous Predictor
    3. Logistic Regression with One Continuous Predictor
    4. Logistic Regression with Multiple Predictors
    5. Comparing Logistic and Multiple Regression
    6. Alternatives to Logistic Regression
    7. Conclusion
  23. Chapter 15: Chi-Square Tests
    1. Chi-Square Tests of Goodness of Fit
    2. Chi-Square Tests of Independence
    3. A Special Case: Two-by-Two Contingency Tables
    4. Relating the Standard Normal Distribution to Chi-Square
    5. Effect Size for Chi-Square Tests
    6. Demonstrating the Relationship of Phi to the Correlation Coefficient
    7. Conclusion
  24. Chapter 16: Nonparametric Tests
    1. Nonparametric Alternatives to t Tests
    2. Nonparametric Alternatives to ANOVA
    3. Nonparametric Alternatives to Correlation
    4. Conclusion
  25. Chapter 17: Using R for Simulation
    1. Defining Statistical Simulation
    2. Some Simulations in R
    3. Conclusion
  26. Chapter 18: The “New” Statistics: Resampling and Bootstrapping
    1. The Pitfalls of Hypothesis Testing
    2. The Bootstrap
    3. Jackknifing
    4. Permutation Tests
    5. More on Modern Robust Statistical Methods
    6. Conclusion
  27. Chapter 19: Making an R Package
    1. The Concept of a Package
    2. Some Windows Considerations
    3. Establishing the Skeleton of an R Package
    4. Editing the R Documentation
    5. Building and Checking the Package
    6. Installing the Package
    7. Making Sure the Package Works Correctly
    8. Maintaining Your R Package
    9. Conclusion
  28. Chapter 20: The R Commander Package
    1. The R Commander Interface
    2. Examples of Using R Commander for Data Analysis
    3. Conclusion
  29. Index