Mixed data types

Data frames in R require a single data type per column. A column cannot hold both numeric and character data; when mixed values are supplied, R coerces the whole column to one type, following this sequence:

Logical → Integer → Double → Character

This means that if, say, a column contains both numeric (integer or double) values and character strings, R will coerce the column to character. We can see this by using the typeof function:

> typeof(c(1,2,"a")) 
[1] "character" 
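
The same check can be repeated for each step of the coercion sequence. The following is a minimal sketch rather than an exhaustive demonstration; the names mixed_types and df are hypothetical and chosen only for illustration:

```r
# Each vector mixes two types; R promotes the result to the more general one.
mixed_types <- c(
  logical_integer  = typeof(c(TRUE, 1L)),   # logical + integer
  integer_double   = typeof(c(1L, 2.5)),    # integer + double
  double_character = typeof(c(2.5, "a"))    # double + character
)
print(mixed_types)

# The same rule applies to a data frame column built from mixed values.
# stringsAsFactors = FALSE keeps the column as character on older R versions.
df <- data.frame(amount = c(10, 20, "a"), stringsAsFactors = FALSE)
print(class(df$amount))
```

Each result should be the type further along in the sequence, and the amount column should be reported as character because of the single "a" value.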

A dataset containing the symbol $ in an amount field, for instance, would be interpreted as a character column even though the column was intended to be numeric. In such cases, it would be essential to leverage string operations in R to cleanse ...
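
One possible way to carry out this kind of cleansing is sketched below using base R's gsub and as.numeric. The data frame sales, its values, and the column name amount_clean are hypothetical, and the handling of thousands separators is an added assumption, since the text mentions only the $ symbol:

```r
# Hypothetical amount column stored as character because of "$" and ",".
sales <- data.frame(
  amount = c("$1,200.50", "$35", "$7.25"),
  stringsAsFactors = FALSE
)

# Strip dollar signs and commas (both are literal inside a character class),
# then convert the remaining digits to numeric.
sales$amount_clean <- as.numeric(gsub("[$,]", "", sales$amount))

print(typeof(sales$amount))        # type of the original column
print(typeof(sales$amount_clean))  # type of the cleansed column
print(sum(sales$amount_clean))     # arithmetic now works on the column
```

Note that as.numeric returns NA with a warning for any value that still contains non-numeric text after the substitution, so it is worth checking for NA values after conversion.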
