Using data.table for data manipulation

One of the most efficient packages for data mining and manipulation in R is data.table. Developed by Matt Dowle and Arun Srinivasan, data.table has consistently outperformed other contemporary R packages in general day-to-day data analysis operations.

The main caveat to using data.table is that its syntax for subsetting and other operations differs slightly from that of data.frame. That said, the benefits of using data.table greatly outweigh the small extra effort required to learn the package.
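To make the syntax difference concrete, here is a minimal sketch comparing the same row filter and column selection on a data.frame and a data.table. The example data and the names df, dt, id, and score are hypothetical and are not taken from the text.

```r
library(data.table)

# Hypothetical example data, used only for illustration
df <- data.frame(id = 1:5, score = c(10, 25, 40, 55, 70))
dt <- as.data.table(df)

# data.frame: columns are qualified with df$ and the trailing comma is required
df_subset <- df[df$score > 30, ]

# data.table: column names are evaluated inside the brackets, so score can be used directly
dt_subset <- dt[score > 30]

# Selecting a single column by name:
# data.frame drops to a plain vector by default, data.table returns a data.table
df_col <- df[, "score"]
dt_col <- dt[, "score"]
```

Both filters keep the same three rows; the difference lies in how columns are referenced and in what type a single-column selection returns.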

data.table can be installed and loaded using the following code:

install.packages("data.table")
library(data.table)

The general form of data.table operations is as follows:

dt[i, j, ...
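As an illustration of this form, the following sketch uses i to filter rows and j to compute a summary, together with the optional by argument that data.table accepts for grouping. The table and its column names (region, sales, avg_sales) are hypothetical.

```r
library(data.table)

# Hypothetical example data, used only for illustration
dt <- data.table(
  region = c("east", "east", "west", "west", "west"),
  sales  = c(100, 150, 200, 50, 300)
)

# i:  keep rows where sales exceed 75
# j:  compute the mean of sales, returned in a column named avg_sales
# by: perform the computation separately for each region
result <- dt[sales > 75, .(avg_sales = mean(sales)), by = region]
print(result)
```

Read left to right, the call says: take dt, subset rows using i, then calculate j, grouped by region.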
