Chapter 12. Getting Data

Data can come from many sources. R ships with many built-in datasets, and many add-on packages contain more. R can also read data from a wide variety of sources and formats. This chapter covers importing data from text files (including spreadsheet-like data in comma- or tab-delimited format, XML, and JSON), binary files (Excel spreadsheets and data from other analysis software), websites, and databases.

Chapter Goals

After reading this chapter, you should:

  • Be able to access datasets provided with R packages
  • Be able to import data from text files
  • Be able to import data from binary files
  • Be able to download data from websites
  • Be able to import data from a database

Built-in Datasets

One of the packages in the base R distribution is called datasets, and it consists entirely of example datasets. While you’ll be lucky if any of them are suited to your particular area of research, they are ideal for testing your code and for exploring new techniques. Many other packages also contain datasets. To list all the datasets available in the packages that you have loaded, call the data function with no arguments:

data()

For a more complete list, including data from all packages that have been installed, use this invocation:

data(package = .packages(TRUE))
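The listing returned by data can also be captured and inspected like any other R object instead of being read in the pager. The following is a minimal sketch, assuming only base R; the variable names listing and available are illustrative and do not come from the text above.

```r
# Capture the listing for one package instead of displaying it
listing <- data(package = "datasets")

# The results component is a character matrix with the columns
# Package, LibPath, Item, and Title
available <- as.data.frame(listing$results, stringsAsFactors = FALSE)

# Show the first few dataset names with their descriptions
head(available[, c("Package", "Item", "Title")])
```

Holding the listing in a data frame makes it possible to search the Title column, for example with grepl, when looking for a dataset on a particular topic.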

To access the data in any of these datasets, call the data function, this time passing the name of the dataset and the package where the data is found (if the package has been loaded, then you ...
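As a minimal sketch of such a call, the example below loads the mtcars dataset from the datasets package. The choice of mtcars is an assumption made here for illustration; any dataset name shown in the listing could be used in the same way.

```r
# Load a dataset by naming both the dataset and its package
data("mtcars", package = "datasets")

# The dataset is now available as an ordinary variable
head(mtcars)
dim(mtcars)
```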
