Data preprocessing

In the data preprocessing step, we focus mainly on two things: data type transformations and data normalization. Finally, we split the data into training and testing datasets for predictive modeling. The code for this section is in the data_preparation.R file. We will use some utility functions, shown in the following code snippet. Remember to load them into memory by running them in the R console:

## data type transformations - factoring
to.factors <- function(df, variables){
  for (variable in variables){
    df[[variable]] <- as.factor(df[[variable]])
  }
  return(df)
}

## normalizing - scaling
scale.features <- function(df, variables){
  for (variable in variables){
    df[[variable]] ...
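
To make the three steps described above concrete (type transformation, normalization, and the train/test split), here is a minimal sketch on a toy data frame. The to.factors helper is copied from the listing above. Everything else is an assumption for illustration: the data frame toy.df, its column names and values, the 60:40 split ratio, and the seed are hypothetical. Because the scale.features listing is cut off, the sketch uses base R's scale() for normalization instead, which may differ from what the original function does.

```r
# Helper copied from the listing above
to.factors <- function(df, variables){
  for (variable in variables){
    df[[variable]] <- as.factor(df[[variable]])
  }
  return(df)
}

# Toy data frame; column names and values are hypothetical
toy.df <- data.frame(
  rating  = c(1, 0, 1, 1, 0, 1, 0, 0, 1, 0),
  purpose = c(1, 2, 3, 1, 2, 3, 1, 2, 3, 1),
  amount  = c(1200, 560, 3400, 800, 2100, 950, 4300, 1500, 700, 2600)
)

# Data type transformation: convert categorical columns to factors
toy.df <- to.factors(toy.df, c("rating", "purpose"))

# Normalization: base R scale() centers to mean 0 and rescales to sd 1
# (a stand-in for the truncated scale.features, not a reconstruction of it)
toy.df$amount <- as.numeric(scale(toy.df$amount))

# Train/test split: the 60:40 ratio and the seed are assumptions
set.seed(42)
train.idx  <- sample(seq_len(nrow(toy.df)), size = floor(0.6 * nrow(toy.df)))
train.data <- toy.df[train.idx, ]
test.data  <- toy.df[-train.idx, ]
```

Calling str(toy.df) after the conversion is a quick way to confirm that the chosen columns are now factors and that the scaled column is numeric. Sampling row indices rather than rows keeps the training and testing sets disjoint.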
