Cover by Joseph Adler

Summary Statistics

R includes a variety of functions for calculating summary statistics.

To calculate the mean of a vector, use the mean function. You can calculate minima with the min function, or maxima with the max function. As an example, let’s use the dow30 data set that we created in “An extended example.” This data set is also available in the nutshell package:

> library(nutshell)
> data(dow30)
> mean(dow30$Open)
[1] 36.24574
> min(dow30$Open)
[1] 0.99
> max(dow30$Open)
[1] 122.45

For each of these functions, the argument na.rm specifies how NA values are treated. By default, if any value in the vector is NA, then the value NA is returned. Specify na.rm=TRUE to ignore missing values:

> mean(c(1, 2, 3, 4, 5, NA))
[1] NA
> mean(c(1, 2, 3, 4, 5, NA), na.rm=TRUE)
[1] 3
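
The same argument works for min and max. As a minimal sketch, using a small made-up vector x (an assumption for illustration, not part of the dow30 data):

```r
# Hypothetical vector with one missing value (not taken from dow30).
x <- c(4, 8, NA, 15)

min(x)                 # NA, because one element is missing
min(x, na.rm = TRUE)   # 4
max(x, na.rm = TRUE)   # 15
mean(x, na.rm = TRUE)  # 9, the mean of 4, 8, and 15

# Counting the missing values shows how many observations na.rm=TRUE drops.
sum(is.na(x))          # 1
```

Checking the number of missing values before setting na.rm=TRUE is a cheap way to confirm that the summary still reflects most of the data.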

Optionally, you can reduce the influence of outliers when using the mean function. To do this, use the trim argument to specify the fraction of observations (between 0 and 0.5) to drop from each end of the sorted vector before the mean is computed:

> mean(c(-1, 0:100, 2000))
[1] 68.4369
> mean(c(-1, 0:100, 2000), trim=0.1)
[1] 50
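
To see what trim does, the trimmed mean above can be reproduced by hand. This is only a sketch of the idea, with k, kept, manual, and builtin as illustrative names; it assumes the number of values dropped from each end is the trim fraction times the vector length, rounded down:

```r
x <- c(-1, 0:100, 2000)

# Number of observations dropped from each end: floor(103 * 0.1) = 10.
k <- floor(length(x) * 0.1)

# Sort, discard the k smallest and k largest values, and average the rest.
kept <- sort(x)[(k + 1):(length(x) - k)]
manual <- mean(kept)

# The built-in trimmed mean for comparison.
builtin <- mean(x, trim = 0.1)

range(kept)                   # 9 91: both -1 and 2000 were dropped
isTRUE(all.equal(manual, builtin))
```

Because the remaining values run evenly from 9 to 91, both calculations give 50.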

To calculate the minimum and maximum at the same time, use the range function. This returns a vector of length two containing the minimum and the maximum:

> range(dow30$Open)
[1]   0.99 122.45
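
Because the result is an ordinary numeric vector, its elements can be indexed or passed to other functions. A short sketch with made-up values (the vector x below is hypothetical and only echoes the extremes shown above):

```r
# Hypothetical values for illustration.
x <- c(12.5, 0.99, 122.45, 36.2)

r <- range(x)
r[1]      # the minimum, 0.99
r[2]      # the maximum, 122.45

# range(x) matches calling min and max separately.
identical(r, c(min(x), max(x)))

# diff() on the range gives the spread between the extremes (about 121.46).
spread <- diff(r)
```

Like mean, min, and max, range accepts na.rm=TRUE to ignore missing values.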

Another useful function is quantile. This function can be used to return the values at different percentiles (specified by the probs argument):

> quantile(dow30$Open, probs=c(0, 0.25, 0.5, 0.75, 1.0))
     0%     25%     50%     75%    100%
  0.990  19.655  30.155  51.680 122.450
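
The probs argument accepts any probabilities between 0 and 1, not just quartiles. As a rough sketch on the integers 0 through 100 (chosen here, as an assumption for illustration, so that the percentiles are easy to check by eye):

```r
x <- 0:100

# 10th, 50th, and 90th percentiles; the result is a named numeric vector.
q <- quantile(x, probs = c(0.1, 0.5, 0.9))
names(q)        # "10%" "50%" "90%"

# Individual percentiles can be pulled out by name.
q[["90%"]]      # 90

# The 50th percentile is the median.
isTRUE(all.equal(q[["50%"]], median(x)))
```

Extracting values by name keeps code readable when many percentiles are requested at once.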

You can return this specific set of values (minimum, 25th percentile, ...
