Often, you are provided with data that is too fine-grained for your analysis. For example, you might be analyzing data about a website. Suppose that you wanted to know the average number of pages delivered to each user. To find the answer, you might need to look at every HTTP transaction (every request for content), grouping requests into sessions and counting the requests in each. R provides a number of different functions for summarizing data, aggregating records together to build a smaller data set.
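As a rough sketch of the website example, assume a hypothetical request log with one row per HTTP request. The data frame `requests`, its columns `user` and `session`, and all of its values are invented for illustration. Counting rows per user collapses the log to one number per user, and the mean of those counts is the average number of pages per user.

```r
# Hypothetical request log: one row per HTTP request.
# All names and values here are made up for illustration.
requests <- data.frame(
  user    = c("u1", "u1", "u1", "u2", "u2", "u3"),
  session = c("s1", "s1", "s2", "s3", "s3", "s4")
)

# Collapse the fine-grained log to one request count per session
requests.per.session <- table(requests$session)
print(requests.per.session)

# Collapse the log to one request count per user
pages.per.user <- as.vector(table(requests$user))

# Average number of pages delivered to each user
avg.pages <- mean(pages.per.user)
print(avg.pages)
```

A real log would first need its requests assigned to users and sessions (for example, from cookies or timestamps); that step is assumed to be done already here.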
The tapply function is a very flexible function for summarizing a vector X. You can specify which subsets of X to summarize, as well as the function used for summarization:
tapply(X, INDEX, FUN = NULL, ..., simplify = TRUE)
Here are the arguments to tapply:
|X||The object on which to apply the function (usually a vector).|
|INDEX||A list of factors that specify the groups of values in X to which FUN is applied; each factor has the same length as X.|
|FUN||The function applied to the elements of X in each group.|
|...||Optional arguments that are passed to FUN.|
For example, we can use tapply to sum the number of home runs by team:
> tapply(X=batting.2008$HR, INDEX=list(batting.2008$teamID), FUN=sum)
ARI ATL BAL BOS CHA CHN CIN CLE COL DET FLO HOU KCA LAA LAN MIL MIN
159 130 172 173 235 184 187 171 ...
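The same call can be tried without the batting.2008 data set. The following is a minimal sketch using a hypothetical data frame `toy`, whose team labels and home run counts are invented. It also shows how an extra argument (here `na.rm = TRUE`) is passed through `...` to FUN, and what changes when `simplify = FALSE`.

```r
# Hypothetical toy data; these values are made up and do not come from batting.2008
toy <- data.frame(
  teamID = c("AAA", "AAA", "BBB", "BBB", "BBB"),
  HR     = c(10, NA, 5, 7, 3)
)

# Sum home runs by team; na.rm = TRUE is passed through `...` to sum()
hr.by.team <- tapply(X = toy$HR, INDEX = list(toy$teamID), FUN = sum, na.rm = TRUE)
print(hr.by.team)

# With simplify = FALSE, the per-group results are kept in a list-mode array
hr.list <- tapply(toy$HR, toy$teamID, sum, na.rm = TRUE, simplify = FALSE)
print(hr.list[["BBB"]])
```

In this sketch the missing value for team AAA is dropped before summing, so the totals are 10 for AAA and 15 for BBB. Without `na.rm = TRUE`, the sum for AAA would be NA.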