Binning the observations

Binning the observations is useful when we want to inspect the shape of a distribution visually or convert a continuous variable into an ordinal one.

Getting ready

To execute this recipe, you will need the pandas and NumPy modules.

No other prerequisites are required.

How to do it…

To bin your observations (as in a histogram), you can use the following code (data_binning.py file):

import numpy as np

# csv_read is assumed to be the pandas DataFrame loaded earlier in the script

# create 6 linearly spaced bin edges that span
# the range of the price values
bins = np.linspace(
    csv_read['price_mean'].min(),
    csv_read['price_mean'].max(),
    6
)

# and assign each observation the index of the bin it falls into
csv_read['b_price'] = np.digitize(
    csv_read['price_mean'],
    bins
)
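To see what the code above produces, here is a minimal, self-contained sketch on made-up data. The six prices and the small DataFrame are hypothetical stand-ins for the recipe's dataset; only the column names price_mean and b_price and the np.linspace and np.digitize calls come from the recipe itself.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the recipe's csv_read DataFrame (values are made up)
csv_read = pd.DataFrame(
    {'price_mean': [100.0, 150.0, 200.0, 320.0, 450.0, 600.0]}
)

# 6 evenly spaced edges between the minimum and the maximum
# define 5 intervals: [100, 200), [200, 300), ..., [500, 600)
bins = np.linspace(
    csv_read['price_mean'].min(),
    csv_read['price_mean'].max(),
    6
)

# np.digitize returns, for each value, the index i such that
# bins[i-1] <= value < bins[i] (the default, right=False)
csv_read['b_price'] = np.digitize(csv_read['price_mean'], bins)

# Count the observations per bin, as a histogram would
counts = csv_read['b_price'].value_counts().sort_index()

print(bins)
print(csv_read)
print(counts)
```

Note that with the default right=False, the maximum value is not below any edge, so it receives the index 6 (the number of edges) rather than joining the fifth interval. If that extra bin is unwanted, one option is to handle the last edge separately, for example by clipping the result to the number of intervals.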

How it works…

First, we create bins. For our price (with the mean imputed ...
