Cover by Joseph Adler

Summarizing Functions

Often, the data you are given is too fine-grained for your analysis. For example, suppose you are analyzing data about a website and want to know the average number of pages delivered to each user. To find the answer, you might need to look at every HTTP transaction (every request for content), group the requests into sessions, and count the number of requests in each. R provides several functions for summarizing data, aggregating records to build a smaller data set.
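As a rough illustration of the website scenario, the sketch below collapses a tiny request log into one row per user and then averages the counts. The data frame, its column names (user, session, url), and its values are all hypothetical, and the formula interface of aggregate is just one of several ways to do the counting.

```r
# Hypothetical request log: one row per HTTP request (names and values are made up).
requests <- data.frame(
  user    = c("u1", "u1", "u1", "u2", "u2", "u3"),
  session = c("s1", "s1", "s2", "s3", "s3", "s4"),
  url     = c("/a", "/b", "/a", "/a", "/c", "/b"),
  stringsAsFactors = FALSE
)

# Count requests within each (user, session) pair.
pages.per.session <- aggregate(url ~ user + session, data = requests, FUN = length)

# Count requests per user: one summary row per user instead of one per request.
pages.per.user <- aggregate(url ~ user, data = requests, FUN = length)

# Average number of pages delivered per user.
avg.pages <- mean(pages.per.user$url)
```

The result is a much smaller data set (three rows instead of six) that answers the question directly.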

tapply, aggregate

The tapply function is a flexible tool for summarizing a vector X. You can specify which subsets of X to summarize, as well as the function used to summarize them:

tapply(X, INDEX, FUN = , ..., simplify = )

Here are the arguments to tapply.

Argument | Description | Default
X | The object on which to apply the function (usually a vector). |
INDEX | A list of factors, each the same length as X, that specify the different sets of values of X over which to calculate FUN. |
FUN | The function applied to elements of X. | NULL
... | Optional arguments that are passed to FUN. |
simplify | If simplify=TRUE and FUN returns a scalar, tapply returns an array with the mode of the scalar. If simplify=FALSE, tapply returns a list. | TRUE
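The effect of the ... and simplify arguments can be seen on a small made-up vector. This is only a sketch: the values of x and the grouping factor g are invented for illustration, and na.rm is an argument of sum that tapply simply passes along.

```r
# Toy data: a numeric vector and a grouping factor (illustrative values only).
x <- c(1, 2, NA, 4, 5, 6)
g <- factor(c("a", "a", "a", "b", "b", "b"))

# Arguments given after FUN are passed through to FUN (here, na.rm = TRUE for sum).
sums <- tapply(x, g, sum, na.rm = TRUE)

# With simplify = TRUE (the default) and a scalar-valued FUN,
# the result is a named numeric array.
is.numeric(sums)

# With simplify = FALSE, each group's result is kept as a list element.
sums.list <- tapply(x, g, sum, na.rm = TRUE, simplify = FALSE)
is.list(sums.list)
```

Keeping the list form can be useful when FUN returns something other than a single value for each group.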

For example, we can use tapply to sum the number of home runs by team:

> tapply(X=batting.2008$HR, INDEX=list(batting.2008$teamID), FUN=sum)
ARI ATL BAL BOS CHA CHN CIN CLE COL DET FLO HOU KCA LAA LAN MIL MIN
159 130 172 173 235 184 187 171 ...
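Because the batting.2008 data set may not be at hand, here is a self-contained sketch of the same call on a toy data frame. The team codes, league codes, and home run counts are invented, not real statistics. It also shows what happens when INDEX contains two factors instead of one.

```r
# Toy stand-in for batting.2008 (made-up values, not real statistics).
batting.toy <- data.frame(
  teamID = c("AAA", "AAA", "BBB", "BBB", "CCC"),
  lgID   = c("AL", "AL", "NL", "NL", "AL"),
  HR     = c(10, 5, 7, 3, 12)
)

# One factor: total home runs per team, mirroring the example above.
hr.by.team <- tapply(X = batting.toy$HR,
                     INDEX = list(batting.toy$teamID),
                     FUN = sum)

# Two factors: a league-by-team matrix of totals.
# League/team combinations with no rows come back as NA.
hr.by.lg.team <- tapply(X = batting.toy$HR,
                        INDEX = list(batting.toy$lgID, batting.toy$teamID),
                        FUN = sum)
```

With a single factor the result is a named one-dimensional array; with two factors it is a matrix whose rows and columns are labeled by the factor levels.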
