Chapter 3. Data Preprocessing and Preparation

This chapter covers the following topics:

  • Renaming the data variable
  • Converting data types
  • Working with the date format
  • Adding new records
  • Filtering data
  • Dropping data
  • Merging data
  • Sorting data
  • Reshaping data
  • Detecting missing data
  • Imputing missing data

Introduction

In the previous chapter, we covered how to integrate data from various data sources. However, simply collecting data is not enough; you also have to ensure the quality of what you collect. If the data is of poor quality, biased samples or missing values can make the results of the analysis misleading. Moreover, if the collected data is not well structured and shaped, you may find it hard to correlate and investigate. Therefore, ...
