Chapter 2. Data Extracting, Transforming, and Loading

This chapter covers the following topics:

  • Downloading open data
  • Reading and writing CSV files
  • Scanning text files
  • Working with Excel files
  • Reading data from databases
  • Scraping web data
  • Accessing Facebook data
  • Working with twitteR

Introduction

Before data can be used to answer critical business questions, it must be prepared. Data is normally archived in files, which can easily be opened with Excel or a text editor. However, data can also reside in a range of other sources, such as databases, websites, and various file formats. Being able to import data from these sources is crucial.

There are four main types of data. Data recorded in text format is the simplest. As some users ...
