Normalizing or standardizing data in a data frame

Distance computations play a big role in many data analytics techniques. Variables measured on larger scales tend to dominate these computations, so you may want to standardize the variables (convert them to z-scores) before computing distances.
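To make the point about dominance concrete, here is a minimal sketch. The toy data frame and its column names (income, age) are made up for illustration and are not part of the housing data. It compares Euclidean distances before and after standardizing.

```r
# Hypothetical data: two variables on very different scales
toy <- data.frame(income = c(30000, 32000, 90000),
                  age    = c(25, 60, 26))

# Raw Euclidean distances: differences in income swamp differences in age
d.raw <- dist(toy)

# Distances after standardizing each column (mean 0, standard deviation 1)
d.z <- dist(scale(toy))

print(d.raw)
print(d.z)
```

On the raw values, row 1 looks far closer to row 2 than to row 3, because the 35-year age gap barely registers next to the income figures. After standardizing, both variables contribute on a comparable footing.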

Getting ready

Download the BostonHousing.csv data file and store it in your R environment's working directory. Then read the data:

> housing <- read.csv("BostonHousing.csv")

How to do it...

To standardize all the variables in a data frame containing only numeric variables, use:

> housing.z <- scale(housing)

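One way to check the result is sketched below. It assumes the built-in mtcars data set as a stand-in for the housing data (all of its columns are numeric), so it runs without the CSV file. The names cars.z and cars.z.df are illustrative only.

```r
# mtcars stands in for the housing data; every column is numeric
cars.z <- scale(mtcars)

# scale() returns a numeric matrix, not a data frame
print(is.matrix(cars.z))

# Column means are 0 (up to rounding error) and standard deviations are 1
print(round(colMeans(cars.z), 10))
print(apply(cars.z, 2, sd))

# The means and standard deviations that were used are kept as attributes
print(attr(cars.z, "scaled:center"))
print(attr(cars.z, "scaled:scale"))

# Convert back to a data frame if later steps expect one
cars.z.df <- as.data.frame(cars.z)
```

The same checks should apply to housing.z, provided every column in housing is numeric.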
The scale() function works only on data frames in which every variable is numeric; otherwise, you will get an error. Note that it returns a matrix rather than a data frame.
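If a data frame mixes numeric and non-numeric variables, one possible workaround is to standardize only the numeric columns. The sketch below is an assumption about how you might do this, not a step from the recipe. It uses the built-in iris data set, whose Species column is a factor, as a stand-in.

```r
# Calling scale() on the whole data frame fails because Species is a factor
res <- try(scale(iris), silent = TRUE)
print(inherits(res, "try-error"))

# Identify the numeric columns
num.cols <- sapply(iris, is.numeric)

# Standardize only those columns and leave the others untouched
iris.z <- iris
iris.z[num.cols] <- as.data.frame(scale(iris[num.cols]))

print(head(iris.z))
```

Working on a copy (iris.z) keeps the original values available for comparison.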

How it works...

When invoked as above, the scale() function computes the ...
