Creating dummy variables

Creating dummy variables means creating a separate variable for each category of a categorical variable. Each dummy variable is an indicator that is 1 when an observation belongs to that category and 0 otherwise. Although a categorical variable contains plenty of information and might show a causal relationship with the output variable, it can't be used in predictive models such as linear and logistic regression without some processing.

In our dataset, sex is a categorical variable with two categories, male and female. We can create two dummy variables from it as follows:

dummy_sex = pd.get_dummies(data['sex'], prefix='sex')

The result of this statement is as follows:

Fig. 2.17: Dummy variable for the sex variable ...
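To make the shape of the result concrete, here is a minimal sketch that runs the same call on a small made-up table. The four-row data frame is a hypothetical stand-in for the chapter's data, not the real dataset, and passing dtype=int is an assumption added here so the output shows 0/1 values (newer pandas versions return True/False by default).

```python
import pandas as pd

# Hypothetical stand-in for the chapter's `data` DataFrame; the values are made up.
data = pd.DataFrame({'sex': ['male', 'female', 'female', 'male']})

# One indicator column per category, named <prefix>_<category>.
# dtype=int is an assumption for readability: it forces 0/1 instead of booleans.
dummy_sex = pd.get_dummies(data['sex'], prefix='sex', dtype=int)

print(dummy_sex)
```

The result has one column per category, sex_female and sex_male, and each row carries a 1 in exactly one of them.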
