Benchmarking text file parsers

Another notable alternative for handling and loading reasonable sized data from flat files to R is the data.table package. Although it has a unique syntax differing from the traditional S-based R markup, the package comes with great documentation, vignettes, and case studies on the indeed impressive speedup it can offer for various database actions. Such uses cases and examples will be discussed in the Chapter 3, Filtering and Summarizing Data and Chapter 4, Restructuring Data.

The package ships a custom R function to read text files with improved performance:

> library(data.table)
> system.time(dt <- fread('hflights.csv'))
   user  system elapsed 
  0.153   0.003   0.158

Loading the data was extremely quick compared to the ...

Get Mastering Data Analysis with R now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.