Using R for Statistics

Book Description

R is a popular and growing open source statistical analysis and graphics environment as well as a programming language and platform. If you need to use a variety of statistics, then Using R for Statistics will get you the answers to most of the problems you are likely to encounter.

Using R for Statistics is a problem-solution primer for using R to set up your data, pose your problems, and get answers using a wide array of statistical tests. The book walks you through R basics and shows how to use R to accomplish a wide variety of statistical operations. You'll be able to navigate the R system, enter and import data, manipulate datasets, calculate summary statistics, create statistical plots and customize their appearance, perform hypothesis tests such as t-tests and analyses of variance, and build regression models. Examples are built around actual datasets to simulate real-world solutions, and programming basics are explained to assist those who do not have a development background.

After reading and using this guide, you'll be comfortable applying R to your specific statistical analyses or hypothesis tests. No prior knowledge of R or of programming is assumed, though you should have some experience with statistics.

Table of Contents

  1. Cover
  2. Title
  3. Copyright
  4. Contents at a Glance
  5. Contents
  6. About the Author
  7. About the Technical Reviewer
  8. Acknowledgments
  9. Introduction
  10. Chapter 1: R Fundamentals
    1. Downloading and Installing R
    2. Getting Orientated
    3. The R Console and Command Prompt
    4. Functions
    5. Objects
      1. Simple Objects
      2. Vectors
      3. Data Frames
    6. The Data Editor
    7. Workspaces
    8. Error Messages
    9. Script Files
    10. Summary
  11. Chapter 2: Working with Data Files
    1. Entering Data Directly
    2. Importing Plain Text Files
      1. CSV and Tab-Delimited Files
      2. DIF Files
      3. Other Plain Text Files
    3. Importing Excel Files
    4. Importing Files from Other Software
    5. Using Relative File Paths
    6. Exporting Datasets
    7. Summary
  12. Chapter 3: Preparing and Manipulating Your Data
    1. Variables
      1. Rearranging and Removing Variables
      2. Renaming Variables
      3. Variable Classes
    2. Calculating New Numeric Variables
    3. Dividing a Continuous Variable into Categories
    4. Working with Factor Variables
    5. Manipulating Character Variables
      1. Concatenating Character Strings
      2. Extracting a Substring
      3. Searching a Character Variable
    6. Working with Dates and Times
    7. Adding and Removing Observations
      1. Adding New Observations
      2. Removing Specific Observations
      3. Removing Duplicate Observations
    8. Selecting a Subset of the Data
      1. Selecting a Subset According to Selection Criteria
      2. Selecting a Random Sample from a Dataset
    9. Sorting a Dataset
    10. Summary
  13. Chapter 4: Combining and Restructuring Datasets
    1. Appending Rows
    2. Appending Columns
    3. Merging Datasets by Common Variables
    4. Stacking and Unstacking a Dataset
      1. Stacking Data
      2. Unstacking Data
    5. Reshaping a Dataset
    6. Summary
  14. Chapter 5: Summary Statistics for Continuous Variables
    1. Univariate Statistics
    2. Statistics by Group
    3. Measures of Association
      1. Covariance
      2. Pearson’s Correlation Coefficient
      3. Spearman’s Rank Correlation Coefficient
    4. Hypothesis Test of Correlation
    5. Comparing a Sample with a Specified Distribution
      1. Shapiro-Wilk Test
      2. Kolmogorov-Smirnov Test
    6. Confidence Intervals and Prediction Intervals
    7. Summary
  15. Chapter 6: Tabular Data
    1. Frequency Tables
      1. Creating Tables
      2. Displaying Tables
      3. Creating Tables from Count Data
      4. Creating a Table Directly
    2. Chi-Square Goodness-of-Fit Test
    3. Tests of Association Between Categorical Variables
      1. Chi-Square Test of Association
      2. Fisher’s Exact Test
    4. Proportions Test
    5. Summary
  16. Chapter 7: Probability Distributions
    1. Probability Distributions in R
    2. Probability Density Functions and Probability Mass Functions
    3. Finding Probabilities
    4. Finding Quantiles
    5. Generating Random Numbers
    6. Summary
  17. Chapter 8: Creating Plots
    1. Simple Plots
    2. Histograms
    3. Normal Probability Plots
    4. Stem-and-Leaf Plots
    5. Bar Charts
    6. Pie Charts
    7. Scatter Plots
    8. Scatterplot Matrices
    9. Box Plots
    10. Plotting a Function
    11. Exporting and Saving Plots
    12. Summary
  18. Chapter 9: Customizing Your Plots
    1. Titles and Labels
    2. Axes
    3. Colors
    4. Plotting Symbols
    5. Plotting Lines
    6. Shaded Areas
    7. Adding Items to Plots
      1. Adding Straight Lines
      2. Adding a Mathematical Function Curve
      3. Adding Labels and Text
      4. Adding a Grid
      5. Adding Arrows
    8. Overlaying Plots
    9. Adding a Legend
    10. Multiple Plots in the Plotting Area
    11. Changing the Default Plot Settings
    12. Summary
  19. Chapter 10: Hypothesis Testing
    1. Student’s T-Tests
      1. One-Sample T-Test
      2. Two-Sample T-Test
      3. Paired T-Test
    2. Wilcoxon Rank-Sum Test
    3. Analysis of Variance
    4. Kruskal-Wallis Test
    5. Multiple Comparison Methods
      1. Tukey’s HSD Test
      2. Other Pairwise T-Tests
      3. Pairwise Wilcoxon Rank-Sum Tests
    6. Hypothesis Tests for Variance
      1. F-Test
      2. Bartlett’s Test
    7. Summary
  20. Chapter 11: Regression and General Linear Models
    1. Building the Model
      1. Simple Linear Regression
      2. Multiple Linear Regression
      3. Interaction Terms
      4. Polynomial Terms
      5. Transformations
      6. The Intercept Term
      7. Including Factor Variables
      8. Updating a Model
      9. Stepwise Model Selection Procedures
    2. Assessing the Fit of the Model
    3. Coefficient Estimates
    4. Plotting the Line of Best Fit
    5. Model Diagnostics
      1. Residual Analysis
      2. Leverage
      3. Cook’s Distances
    6. Making Predictions
    7. Summary
  21. Appendix A: Add-On Packages
    1. Viewing a List of Available Add-On Packages
    2. Installing and Loading Add-On Packages
      1. Windows Users
      2. Mac Users
      3. Linux Users
  22. Appendix B: Basic Programming with R
    1. Creating New Functions
    2. Conditional Statements
      1. Conditions
      2. If Statement
      3. If/else Statement
      4. The switch Function
    3. Loops
      1. For Loop
      2. While Loop
    4. Summary
  23. Appendix C: Datasets
    1. apartments
    2. bigcats
    3. bottles
    4. brains
    5. CIAdata1, CIAdata2
    6. coffeeshop
    7. concrete
    8. CPIdata
    9. customers
    10. endangered
    11. fiveyearreport
    12. flights
    13. fruit
    14. grades1
    15. people
    16. people2
    17. powerplant
    18. pulserates
    19. resistance
    20. supermarkets
    21. vitalsigns
    22. WHOdata
  24. Index