dplyr versus data.table

You might now be wondering, "which package should we use?"

The dplyr and data.table packages differ dramatically in syntax, but much less decisively in performance. Although data.table seems to be slightly more efficient on larger datasets, there is no clear winner on that front, except when aggregating over a large number of groups. And to be honest, the pipe syntax commonly used with dplyr, which is provided by the magrittr package, can also be used with data.table objects if needed.
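To make the syntax difference concrete, here is a minimal sketch that computes the same grouped mean three ways: with dplyr, with data.table, and with data.table driven through the magrittr pipe. The toy data and the column names (group, value, mean_value) are invented for this illustration and are not taken from the book's datasets.

```r
library(data.table)
library(dplyr)  # attaching dplyr also makes the magrittr pipe (%>%) available

# Hypothetical toy data, made up purely for this example
df <- data.frame(
  group = c("a", "a", "b", "b", "b"),
  value = c(1, 2, 3, 4, 5)
)
dt <- as.data.table(df)

# dplyr: a chain of verbs connected by the pipe
res_dplyr <- df %>%
  group_by(group) %>%
  summarise(mean_value = mean(value))

# data.table: the same aggregation in a single dt[i, j, by] call
res_dt <- dt[, .(mean_value = mean(value)), by = group]

# data.table with the magrittr pipe: the dot stands for the piped-in object
res_piped <- dt %>%
  .[, .(mean_value = mean(value)), by = group] %>%
  .[order(-mean_value)]
```

All three results hold the same group means; the piped data.table version additionally sorts them in descending order. Which style reads better is largely a matter of taste.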

There is also another R package that provides pipes, called pipeR, which claims to be a lot more efficient on larger datasets than magrittr. This performance gain is due to the fact that the pipeR operators do not try ...
