Grouping the data – aggregation, filtering, and transformation

In this section, you will learn how to aggregate data over categorical variables. This is common practice whenever a dataset contains categorical variables: grouping lets us analyse each category separately and informs later decisions about the modelling.

To illustrate grouping and aggregation, let's create a simple dummy data frame with a mix of numerical and categorical variables. We will use what we have covered so far about random numbers to build it, as shown in the following snippet:

import numpy as np
import pandas as pd
a=['Male','Female']
b=['Rich','Poor','Middle Class']
gender=[]
seb=[]
for i in ...
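The snippet above is cut off, so the following is only a minimal sketch of the three operations named in the section title, not the book's own code. It assumes a dummy frame with two categorical columns drawn from the same category lists and two random numerical columns. The names `rng`, `n_rows`, `df`, `Gender`, `SEB`, `Height`, and `Income`, as well as the distributions and thresholds, are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Assumed setup: a seeded generator so the dummy data is reproducible.
rng = np.random.default_rng(seed=0)
n_rows = 100  # hypothetical row count

genders = ['Male', 'Female']
classes = ['Rich', 'Poor', 'Middle Class']

# Dummy frame: two categorical and two numerical columns (names are assumptions).
df = pd.DataFrame({
    'Gender': rng.choice(genders, size=n_rows),
    'SEB': rng.choice(classes, size=n_rows),
    'Height': rng.normal(loc=165, scale=10, size=n_rows),
    'Income': rng.integers(20000, 100000, size=n_rows),
})

# Aggregation: one summary row per (Gender, SEB) combination.
mean_by_group = df.groupby(['Gender', 'SEB'])[['Height', 'Income']].mean()

# Filtering: keep only the rows that belong to groups with at least 10 members.
large_groups = df.groupby('SEB').filter(lambda g: len(g) >= 10)

# Transformation: return a value per original row, here height centred on
# the mean height of that row's gender group.
df['Height_centered'] = df.groupby('Gender')['Height'].transform(
    lambda s: s - s.mean()
)
```

Aggregation shrinks the frame to one row per group, filtering keeps or drops whole groups while preserving the original rows, and transformation returns a result aligned with the original index, which is why it can be assigned straight back as a new column.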
