Sometimes, we may wish to split data into subsets and apply a function such as the mean, max, or min to each subset. In R, we can do this via the
Here, we use a dataset of statistics on the top five strikers of the four clubs that reached the semi-finals of the 2014 European Champions League football tournament. We will use it to illustrate aggregation in R and the equivalent GroupBy functionality in pandas.
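As a preview of the split-and-apply idea on the pandas side, the following is a minimal sketch rather than the full analysis. It assumes a small hand-built DataFrame instead of the CSV file: the two Atletico Madrid rows are taken from the R listing in this section, while the "Example FC" rows and their values are made-up placeholders, added only so that there is more than one group.

```python
import pandas as pd

# Two rows come from the R listing in this section; the "Example FC" rows
# are hypothetical placeholders so that there is more than one group.
goal_stats = pd.DataFrame({
    "Club": ["Atletico Madrid", "Atletico Madrid", "Example FC", "Example FC"],
    "Player": ["Diego Costa", "ArdaTuran", "Player X", "Player Y"],
    "Goals": [8, 4, 5, 1],
    "GamesPlayed": [9, 9, 10, 10],
})

# Split the rows by club, then apply sum, mean, and max to each club's goals.
goals_by_club = goal_stats.groupby("Club")["Goals"].agg(["sum", "mean", "max"])
print(goals_by_club)
```

Each row of the result corresponds to one club and each column to one of the functions applied to that club's subset of the data.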
In R, aggregation is done using the following commands, starting with reading in the data:
> goal_stats=read.csv('champ_league_stats_semifinalists.csv')
> goal_stats
             Club       Player Goals GamesPlayed
1 Atletico Madrid  Diego Costa     8           9
2 Atletico Madrid    ArdaTuran     4           9
3 Atletico Madrid   RaúlGarcía ...