R in a Nutshell, 2nd Edition

By Joseph Adler. Published by O'Reilly Media, Inc.

Charts and Graphics

R includes several packages for visualizing data: graphics, grid, and lattice. Usually, you’ll find that functions within the graphics and lattice packages are the most useful.[10] If you’ve used Microsoft Excel, you’ll find that R can generate all of the charts you already know: column charts, bar charts, line plots, pie charts, and scatter plots. Even if that’s all you need, R makes it much easier than Excel to automate the creation of charts and to customize them. R also offers many more chart types, many of them quite intuitive and elegant.

To make this a little more interesting, let’s work with some real data. We’re going to look at all field goal attempts in the National Football League (NFL) in 2005.[11] For those of you who aren’t familiar with American football, here’s a quick explanation. A team can attempt to kick a football between a set of goalposts to receive 3 points. If the kick misses, possession of the ball reverts to the other team (at the spot on the field where the kick was attempted).

First, let’s take a quick look at the distribution of distances. R provides a function, hist, that can do this quickly for us. Let’s start by loading the appropriate data set. (The data set is included in the nutshell package; see the Preface for information on how to obtain this package.)

> library(nutshell)
> data(field.goals)
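
As a rough sketch of what hist does, the snippet below bins a numeric vector and plots the counts. It does not use the field.goals data: the yards vector here is simulated, hypothetical stand-in data, so the shape of the plot says nothing about real NFL kicks. It only illustrates that hist draws the chart and also returns the bins it computed.

```r
# Stand-in data: simulated kick distances in yards. These values are
# hypothetical and are NOT taken from the nutshell package's field.goals.
set.seed(1)
yards <- round(runif(200, min = 18, max = 60))

# hist() draws the histogram and invisibly returns an object of class
# "histogram" describing the bins it chose. breaks = 10 is only a
# suggestion for the number of bins; R picks "pretty" bin edges.
h <- hist(yards, breaks = 10,
          main = "Simulated kick distances",
          xlab = "Distance (yards)")

# Inspect the bin edges and the number of observations in each bin.
h$breaks
h$counts
```

Every observation falls into exactly one bin, so the counts add up to the length of the input vector.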

Let’s take a ...
