You are previewing R in a Nutshell, 2nd Edition.

R in a Nutshell, 2nd Edition

Cover of R in a Nutshell, 2nd Edition by Joseph Adler Published by O'Reilly Media, Inc.
  1. R in a Nutshell
  2. Preface
    1. Why I Wrote This Book
    2. When Should You Use R?
    3. What’s New in the Second Edition?
    4. R License Terms
    5. Examples
    6. How This Book Is Organized
    7. Conventions Used in This Book
    8. Using Code Examples
    9. Safari® Books Online
    10. How to Contact Us
    11. Acknowledgments
  3. I. R Basics
    1. 1. Getting and Installing R
      1. R Versions
      2. Getting and Installing Interactive R Binaries
    2. 2. The R User Interface
      1. The R Graphical User Interface
      2. The R Console
      3. Batch Mode
      4. Using R Inside Microsoft Excel
      5. RStudio
      6. Other Ways to Run R
    3. 3. A Short R Tutorial
      1. Basic Operations in R
      2. Functions
      3. Variables
      4. Introduction to Data Structures
      5. Objects and Classes
      6. Models and Formulas
      7. Charts and Graphics
      8. Getting Help
    4. 4. R Packages
      1. An Overview of Packages
      2. Listing Packages in Local Libraries
      3. Loading Packages
      4. Exploring Package Repositories
      5. Installing Packages From Other Repositories
      6. Custom Packages
  4. II. The R Language
    1. 5. An Overview of the R Language
      1. Expressions
      2. Objects
      3. Symbols
      4. Functions
      5. Objects Are Copied in Assignment Statements
      6. Everything in R Is an Object
      7. Special Values
      8. Coercion
      9. The R Interpreter
      10. Seeing How R Works
    2. 6. R Syntax
      1. Constants
      2. Operators
      3. Expressions
      4. Control Structures
      5. Accessing Data Structures
      6. R Code Style Standards
    3. 7. R Objects
      1. Primitive Object Types
      2. Vectors
      3. Lists
      4. Other Objects
      5. Attributes
    4. 8. Symbols and Environments
      1. Symbols
      2. Working with Environments
      3. The Global Environment
      4. Environments and Functions
      5. Exceptions
    5. 9. Functions
      1. The Function Keyword
      2. Arguments
      3. Return Values
      4. Functions as Arguments
      5. Argument Order and Named Arguments
      6. Side Effects
    6. 10. Object-Oriented Programming
      1. Overview of Object-Oriented Programming in R
      2. Object-Oriented Programming in R: S4 Classes
      3. Old-School OOP in R: S3
  5. III. Working with Data
    1. 11. Saving, Loading, and Editing Data
      1. Entering Data Within R
      2. Saving and Loading R Objects
      3. Importing Data from External Files
      4. Exporting Data
      5. Importing Data From Databases
      6. Getting Data from Hadoop
    2. 12. Preparing Data
      1. Combining Data Sets
      2. Transformations
      3. Binning Data
      4. Subsets
      5. Summarizing Functions
      6. Data Cleaning
      7. Finding and Removing Duplicates
      8. Sorting
  6. IV. Data Visualization
    1. 13. Graphics
      1. An Overview of R Graphics
      2. Graphics Devices
      3. Customizing Charts
    2. 14. Lattice Graphics
      1. History
      2. An Overview of the Lattice Package
      3. High-Level Lattice Plotting Functions
      4. Customizing Lattice Graphics
      5. Low-Level Functions
    3. 15. ggplot2
      1. A Short Introduction
      2. The Grammar of Graphics
      3. A More Complex Example: Medicare Data
      4. Quick Plot
      5. Creating Graphics with ggplot2
      6. Learning More
  7. V. Statistics with R
    1. 16. Analyzing Data
      1. Summary Statistics
      2. Correlation and Covariance
      3. Principal Components Analysis
      4. Factor Analysis
      5. Bootstrap Resampling
    2. 17. Probability Distributions
      1. Normal Distribution
      2. Common Distribution-Type Arguments
      3. Distribution Function Families
    3. 18. Statistical Tests
      1. Continuous Data
      2. Discrete Data
    4. 19. Power Tests
      1. Experimental Design Example
      2. t-Test Design
      3. Proportion Test Design
      4. ANOVA Test Design
    5. 20. Regression Models
      1. Example: A Simple Linear Model
      2. Details About the lm Function
      3. Subset Selection and Shrinkage Methods
      4. Nonlinear Models
      5. Survival Models
      6. Smoothing
      7. Machine Learning Algorithms for Regression
    6. 21. Classification Models
      1. Linear Classification Models
      2. Machine Learning Algorithms for Classification
    7. 22. Machine Learning
      1. Market Basket Analysis
      2. Clustering
    8. 23. Time Series Analysis
      1. Autocorrelation Functions
      2. Time Series Models
  8. VI. Additional Topics
    1. 24. Optimizing R Programs
      1. Measuring R Program Performance
      2. Optimizing Your R Code
      3. Other Ways to Speed Up R
    2. 25. Bioconductor
      1. An Example
      2. Key Bioconductor Packages
      3. Data Structures
      4. Where to Go Next
    3. 26. R and Hadoop
      1. R and Hadoop
      2. Other Packages for Parallel Computation with R
      3. Where to Learn More
  9. A. R Reference
    1. base
      1. Functions
      2. Data Sets
    2. boot
      1. Functions
      2. Data Sets
    3. class
      1. Functions
    4. cluster
      1. Functions
      2. Data Sets
    5. codetools
    6. foreign
      1. Functions
    7. grDevices
      1. Functions
      2. Data Sets
    8. graphics
      1. Functions
    9. grid
    10. KernSmooth
      1. Functions
    11. lattice
      1. Functions
      2. Data Sets
    12. MASS
      1. Functions
      2. Data Sets
    13. methods
      1. Functions
    14. mgcv
    15. nlme
    16. nnet
      1. Functions
    17. rpart
      1. Functions
      2. Data Sets
    18. spatial
      1. Functions
    19. splines
      1. Functions
    20. stats
      1. Functions
      2. Data Set
    21. stats4
      1. Functions
    22. survival
      1. Functions
      2. Data Sets
    23. tcltk
    24. tools
      1. Functions
      2. Data Sets
    25. utils
      1. Functions
  10. Bibliography
  11. Index
  12. About the Author
  13. Colophon
  14. Copyright
O'Reilly logo

Exploring Package Repositories

You can find thousands of R packages online. The two biggest sources of packages are CRAN (Comprehensive R Archive Network) and Bioconductor, but some packages are available elsewhere. (If you know Perl, you’ll notice that CRAN is very similar to CPAN, the Comprehensive Perl Archive Network.) CRAN is hosted by the R Foundation (the same nonprofit organization that oversees R development). The archive contains a very large number of packages (there were 1,698 packages on February 24, 2009), covering a wide number of different applications. CRAN is hosted on a set of mirror sites around the world. Try to pick an archive site near you: you’ll minimize download times and help reduce the server load on the R Foundation.

Bioconductor is an open-source project for building tools to analyze genomic data. Bioconductor tools are built using R and are distributed as R packages. The Bioconductor packages are distributed separately from R, and most are not available on CRAN. There are dozens of different packages available directly through the Bioconductor project.

R-Forge is another interesting place to look for packages. The R-Forge site contains projects that are in progress, and it provides tools for developers to collaborate. You may find some interesting packages on this site, but please be sure to read the disclaimers and documentation, because many of these packages are works in progress.

R includes the ability to download and install packages from other repositories. However, I don’t know of other public repositories for R packages. Most R projects simply use CRAN to host their packages. (I’ve even seen some books that use CRAN to distribute sample code and sample data.)

Exploring R Package Repositories on the Web

R provides good tools for installing packages within the GUI but doesn’t provide a good way to find a specific package. Luckily, it’s pretty easy to find a package on the Web.

You can browse through the set of available packages with your web browser. Here are some places to look for packages.

CRANSee for an authoritative list, but you should try to find your local mirror and use that site instead

However, you can also try to find packages with a search engine. I’ve had good luck finding packages by using Google to search for “R package” plus the name of the application. For example, searching for “R package multivariate additive regression splines” can help you find the mda package, which contains the mars function. (Of course, I discovered later that the earth package is a better choice for this algorithm, but we’ll get to that later.)

Finding and Installing Packages Inside R

Once you figure out what package you want to install, the easiest way to do it is inside R.

Windows and Linux GUIs

Installing packages through the Windows GUI is pretty straightforward.

  1. (Optional) By default, R is set to fetch packages from the “CRAN” and “CRAN (extra)” categories. To pick additional sets of packages, choose “Select repositories” from the Packages menu. You can choose multiple repositories.

  2. From the Packages menu, select “Install package(s)”.

  1. If this is the first time you are installing a package during this session, R will ask you to pick a mirror. (You’ll probably want to pick a site that is geographically close, because it’s likely to also be close on the Internet, and thus fast.)

  2. Click the name of the package that you want to install and press OK.

R will download and install the packages that you have selected.

Note that you may run into issues installing packages, depending on the permissions assigned to your user account. If you are using Windows XP, and your account is a member of the Administrators group, you should have no problems. If you are using Windows Vista, and you installed R in your own directory, you should have no issues. Otherwise, you may need to run R as an Administrator in order to install supplementary packages.


On Mac OS X, there is a slightly different user interface for package installation. It shows a little more information than the Windows version, but it’s a little more confusing to use.

  1. From the Package and Data menu, select Package Installer. (See Figure 4-1 for a picture of the installer window.)

  2. (Optional) In the top-left corner of the window is a menu that allows you to select the category of packages you would like to download. Initially, this is set to “CRAN (binaries).”

  3. Click the Get List button to display the available set of packages.

  4. You can use the search box to filter the list to show only packages that match the name you are looking for. (Note: you have to click the Get List button before the search will return results.)

  5. Select the set of packages that you want to install and press the Install Selected button.

By default, R will install packages at the system level, making them available to all users. If you do not have the appropriate permissions to install packages globally, or if you would like to install them elsewhere, then select an alternative location. Additionally, R will not install the additional packages on which your packages depend. You will get an error if you try to load a package and have not installed other packages on which it is dependent.

R console

You can also install R packages directly from the R console. Table 4-2 shows the set of commands for installing packages from the console. As a simple example, suppose that you wanted to install the packages tree and maptree. You could accomplish this with the following command:

> install.packages(c("tree","maptree"))
trying URL '
Content type 'application/x-gzip' length 103712 bytes (101 Kb)
opened URL
downloaded 101 Kb

trying URL '
Content type 'application/x-gzip' length 101577 bytes (99 Kb)
opened URL
downloaded 99 Kb

The downloaded packages are in

This will install the packages to the default library specified by the variable .Library. If you’d like to remove these packages after you’re done, you can use remove.packages. You need to specify the library where the packages were installed:

> remove.packages(c("tree", "maptree"),.Library)

Table 4-2. Common package installation commands

installed.packagesReturns a matrix with information about all currently installed packages.
available.packagesReturns a matrix of all packages available on the repository.
old.packagesReturns a matrix of all currently installed packages for which newer versions are available.
new.packagesReturns a matrix showing all currently uninstalled packages available from the package repositories.
download.packagesDownloads a set of packages to a local directory.
install.packagesInstalls a set of packages from the repository.
remove.packagesRemoves a set of installed packages.
update.packagesUpdates installed packages to the latest versions.
setRepositoriesSets the current list of package repositories.

Installing from the command line

You can also install downloaded packages from the command line. (There is actually a set of different commands that you can issue to R directly from the command line, without launching the full R shell.) To do this, you run R with the CMD INSTALL option. For example, suppose that you had downloaded the package aplpack (“Another Plotting PACKage”). For Mac OS X, the binary file is called aplpack_1.1.1.tgz. To install this package, change to the directory where the package is located and issue the following command:

$ R CMD INSTALL aplpack_1.1.1.tgz

If successful, you’ll see a message like the following:

* Installing to library '/Library/Frameworks/R.framework/Resources/library'
* Installing *binary* package 'aplpack' ...
* DONE (aplpack)

The best content for your career. Discover unlimited learning on demand for around $1/day.