Chapter 8. Data Import with readr

Introduction

Working with data provided by R packages is a great way to learn the tools of data science, but at some point you want to stop learning and start working with your own data. In this chapter, you’ll learn how to read plain-text rectangular files into R. Here, we’ll only scratch the surface of data import, but many of the principles will translate to other forms of data. We’ll finish with a few pointers to packages that are useful for other types of data.

Prerequisites

In this chapter, you’ll learn how to load flat files in R with the readr package, which is part of the core tidyverse.

library(tidyverse)

Getting Started

Most of readr’s functions are concerned with turning flat files into data frames:

  • read_csv() reads comma-delimited files, read_csv2() reads semicolon-separated files (common in countries where , is used as the decimal separator), read_tsv() reads tab-delimited files, and read_delim() reads files with any delimiter.

  • read_fwf() reads fixed-width files. You can specify fields either by their widths with fwf_widths() or by their positions with fwf_positions(). read_table() reads a common variation of fixed-width files where columns are separated by whitespace.

  • read_log() reads Apache-style log files. (But also check out webreadr, which is built on top of read_log() and provides many more helpful tools.)
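
As a rough illustration of the functions listed above, the sketch below writes a few tiny made-up files to temporary paths and reads each one back. The file contents, the column names (name, score, age), and all variable names are assumptions invented for this example; only the readr functions themselves come from the text.

```r
library(readr)

# Helper (hypothetical): write lines to a temporary file and return its path
write_tmp <- function(lines) {
  path <- tempfile(fileext = ".txt")
  writeLines(lines, path)
  path
}

# Comma-delimited, with . as the decimal separator
csv_path <- write_tmp(c("name,score", "ada,9.5", "bob,7.25"))
scores_csv <- read_csv(csv_path)

# Semicolon-delimited, with , as the decimal separator
csv2_path <- write_tmp(c("name;score", "ada;9,5", "bob;7,25"))
scores_csv2 <- read_csv2(csv2_path)

# Tab-delimited
tsv_path <- write_tmp(c("name\tscore", "ada\t9.5", "bob\t7.25"))
scores_tsv <- read_tsv(tsv_path)

# Any other delimiter, here a pipe
psv_path <- write_tmp(c("name|score", "ada|9.5", "bob|7.25"))
scores_psv <- read_delim(psv_path, delim = "|")

# Fixed-width: name occupies characters 1-6, age occupies characters 7-8
fwf_path <- write_tmp(c("ada   36", "bob   41"))

# Describe the fields by their widths ...
by_width <- read_fwf(
  fwf_path,
  col_positions = fwf_widths(c(6, 2), col_names = c("name", "age"))
)

# ... or by their start and end positions
by_position <- read_fwf(
  fwf_path,
  col_positions = fwf_positions(c(1, 7), c(6, 8), col_names = c("name", "age"))
)
```

All four delimited reads should yield the same two columns, and the two fixed-width calls describe the same layout in different ways. Writing to temporary files keeps the sketch independent of any particular data set.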

These functions all have similar syntax: once you’ve mastered one, you can use the others with ease. ...
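
To make the shared syntax concrete, here is a minimal sketch that passes the same skip and col_names arguments to read_csv() and read_tsv(). The file contents, the column names x, y, and z, and the "hypothetical tool" header line are assumptions made up for this example.

```r
library(readr)

# Two made-up files with identical content, differing only in the delimiter.
# The first line is a header note that should be skipped.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("exported by a hypothetical tool", "1,2,NA", "4,5,6"), csv_path)

tsv_path <- tempfile(fileext = ".tsv")
writeLines(c("exported by a hypothetical tool", "1\t2\tNA", "4\t5\t6"), tsv_path)

# The same arguments work for both functions:
#   skip      = number of lines to skip before reading data
#   col_names = column names to use when the file has no header row
from_csv <- read_csv(csv_path, skip = 1, col_names = c("x", "y", "z"))
from_tsv <- read_tsv(tsv_path, skip = 1, col_names = c("x", "y", "z"))
```

Both calls should return a data frame with two rows and columns x, y, and z, with the literal NA in the first row read as a missing value.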
