R includes a variety of functions for calculating summary statistics.

To calculate the mean of a vector, use the `mean` function. You can calculate minima with the `min` function, or maxima with the `max` function. As an example, let’s use the `dow30` data set that we created in An extended example. This data set is also available in the `nutshell` package:

`> `**library(nutshell)**
`> `**data(dow30)**
`> `**mean(dow30$Open)**
[1] 36.24574
`> `**min(dow30$Open)**
[1] 0.99
`> `**max(dow30$Open)**
[1] 122.45

For each of these functions, the argument `na.rm` specifies how `NA` values are treated. By default, if any value in the vector is `NA`, then the value `NA` is returned. Specify `na.rm=TRUE` to ignore missing values:

`> `**mean(c(1, 2, 3, 4, 5, NA))**
[1] NA
`> `**mean(c(1, 2, 3, 4, 5, NA), na.rm=TRUE)**
[1] 3
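The same argument applies to `min` and `max`. The following is a minimal sketch using a hypothetical vector `x` (made up for illustration, not taken from `dow30`):

```r
# Hypothetical vector with one missing value (illustrative only)
x <- c(4, 8, NA, 15)

min(x)                  # NA, because one element is missing
max(x)                  # NA
min(x, na.rm = TRUE)    # 4
max(x, na.rm = TRUE)    # 15
mean(x, na.rm = TRUE)   # 9, the mean of 4, 8, and 15
```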

Optionally, you can also remove outliers when using the `mean` function. To do this, use the `trim` argument to specify the fraction of observations (between 0 and 0.5) to drop from each end of the sorted vector before the mean is computed:

`> `**mean(c(-1, 0:100, 2000))**
[1] 68.4369
`> `**mean(c(-1, 0:100, 2000), trim=0.1)**
[1] 50
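To see what trimming does, the calculation can be reproduced by hand. This is only a sketch of the idea, and the variable names (`k`, `kept`, `manual`, `builtin`) are chosen here for illustration: sort the vector, drop `floor(n * trim)` values from each end, and average what remains.

```r
x <- c(-1, 0:100, 2000)
trim <- 0.1

n <- length(x)                      # 103 observations
k <- floor(n * trim)                # 10 values dropped from each end
kept <- sort(x)[(k + 1):(n - k)]    # the middle 83 values, 9 through 91

manual  <- mean(kept)               # 50
builtin <- mean(x, trim = trim)     # 50

all.equal(manual, builtin)          # TRUE
```

Both outliers (-1 and 2000) fall in the trimmed tails, which is why the trimmed mean lands on 50 rather than 68.4369.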

To calculate the minimum and maximum at the same time, use the `range` function. This returns a vector with the minimum and maximum value:

`> `**range(dow30$Open)**
[1] 0.99 122.45
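As a quick check on a hypothetical vector (values invented for illustration), the result of `range` matches calling `min` and `max` separately, and its elements can be indexed like those of any other vector:

```r
# Hypothetical values, not from dow30
x <- c(12.5, 3.2, 47.0, 8.1)

r <- range(x)
r                                  # 3.2 47.0
identical(r, c(min(x), max(x)))    # TRUE

r[1]                               # the minimum, 3.2
r[2]                               # the maximum, 47
```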

Another useful function is `quantile`. This function can be used to return the values at different percentiles (specified by the `probs` argument):

`> `**quantile(dow30$Open, probs=c(0, 0.25, 0.5, 0.75, 1.0))**
0% 25% 50% 75% 100%
0.990 19.655 30.155 51.680 122.450
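A small sketch on the integers 1 through 9 (a toy vector chosen so the results are easy to verify by eye) shows the shape of the return value: a named numeric vector, with one element per requested probability. It assumes the default interpolation method of `quantile`.

```r
x <- 1:9

q <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1.0))
q
#   0%  25%  50%  75% 100%
#    1    3    5    7    9

names(q)      # "0%" "25%" "50%" "75%" "100%"
q[["50%"]]    # 5, the median

# Any probability between 0 and 1 can be requested
p90 <- quantile(x, probs = 0.9)    # interpolates between 8 and 9
```

With the default method, values that fall between two observations are linearly interpolated, so the 90th percentile here is about 8.2 rather than one of the original integers.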

You can return this specific set of values (minimum, 25th percentile, ...
