Principal Components Analysis

Another technique for analyzing data is principal components analysis (PCA). PCA transforms a set of (possibly correlated) variables into a set of linearly uncorrelated variables, called principal components, ordered so that the first component accounts for the most variance.
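To make "uncorrelated" concrete, here is a minimal sketch that carries out the transformation by hand with base R. It assumes the built-in USArrests data set as sample data (it is not the data used in this section), and the variable names X, e, and scores are chosen only for illustration.

```r
# USArrests ships with R; it is used here purely as sample data
X <- scale(as.matrix(USArrests), center = TRUE, scale = FALSE)

# Eigenvectors of the covariance matrix define the new axes
e <- eigen(cov(X))

# Project the centered data onto those axes
scores <- X %*% e$vectors

# The original variables are correlated with one another ...
round(cor(X), 2)

# ... but the covariance matrix of the projected data is diagonal:
# off-diagonal entries vanish up to floating-point error, and the
# diagonal holds the eigenvalues (the component variances)
round(cov(scores), 4)
```

This is only one way to compute the components; it is shown because it keeps each step visible.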

In R, principal components analysis is available through the function prcomp in the stats package:

## S3 method for class 'formula':
prcomp(formula, data = NULL, subset, na.action, ...)

## Default S3 method:
prcomp(x, retx = TRUE, center = TRUE, scale. = FALSE,
       tol = NULL, ...)

Here is a description of the arguments to prcomp.

Argument	Description	Default
formula	In the formula method, a formula with no response variable that names the columns of a data frame to use in the analysis.
data	An optional data frame containing the variables named in formula.	NULL
subset	An (optional) vector specifying observations to include in the analysis.
na.action	A function specifying how to deal with `NA` values.
x	In the default method, a numeric or complex matrix (or data frame) of data for the analysis.
retx	A logical value specifying whether rotated variables should be returned.	TRUE
center	A logical value specifying whether values should be zero centered.	TRUE
scale.	A logical value specifying whether values should be scaled to have unit variance.	FALSE
tol	A numeric tolerance: components whose standard deviation is less than or equal to tol times the standard deviation of the first component are omitted.	NULL
...	Additional arguments passed to other methods.
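The arguments above can be tried out with a short sketch. It assumes the built-in USArrests data set as sample data (not the data used in this section), and the object names are hypothetical.

```r
# USArrests ships with R; it is used here purely as sample data

# Default method: pass the data as x
pca_default <- prcomp(USArrests, center = TRUE, scale. = TRUE)

# Formula method: a one-sided formula (no response) names the columns
pca_formula <- prcomp(~ Murder + Assault + UrbanPop + Rape,
                      data = USArrests, scale. = TRUE)

# Both calls describe the same analysis
all.equal(pca_default$sdev, pca_formula$sdev)

# With scale. = FALSE (the default), variables measured on larger
# scales dominate the first component; compare the two summaries
pca_unscaled <- prcomp(USArrests)
summary(pca_unscaled)
summary(pca_default)

# retx = FALSE leaves the rotated data (element x) out of the result
pca_no_x <- prcomp(USArrests, retx = FALSE)
is.null(pca_no_x$x)

# tol drops components whose standard deviation is at most
# tol times that of the first component
pca_tol <- prcomp(USArrests, scale. = TRUE, tol = 0.5)
ncol(pca_tol$rotation)
```

Whether to set scale. = TRUE depends on the data: it is usually a reasonable choice when the variables are measured in different units.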

As an example, let’s try principal components analysis on a matrix of team batting statistics. Let’s start by loading ...


R in a Nutshell by Joseph Adler
