Chapter 1. Data Exploration with RMS Titanic

In this chapter, we will cover the following recipes:

  • Reading a Titanic dataset from a CSV file
  • Converting types on character variables
  • Detecting missing values
  • Imputing missing values
  • Exploring and visualizing data
  • Predicting passenger survival with a decision tree
  • Validating the power of prediction with a confusion matrix
  • Assessing performance with the ROC curve

Introduction

Data exploration helps a data consumer focus the search for information so that a sound analysis can be built from what is gathered. After completing the steps of data munging, analysis, modeling, and evaluation, users can draw insights and useful conclusions from the data they have focused on.
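As a rough illustration of the munging and analysis steps named above, the following base R sketch runs type conversion, missing-value detection, simple imputation, and a cross-tabulation on a toy data frame. The `passengers` data frame and its values are invented for illustration and are not the real Titanic records. Mean imputation is only one possible choice, assumed here for brevity.

```r
# Toy data, invented for illustration only (not the real Titanic records)
passengers <- data.frame(
  Survived = c(0, 1, 1, 0, 1),
  Sex      = c("male", "female", "female", "male", "female"),
  Age      = c(22, 38, NA, 35, NA),
  stringsAsFactors = FALSE
)

# Munging: treat the outcome and the character column as categorical
passengers$Survived <- factor(passengers$Survived)
passengers$Sex      <- factor(passengers$Sex)

# Detect missing values in Age
n_missing <- sum(is.na(passengers$Age))

# Impute missing ages with the mean of the observed ages (an assumed strategy)
passengers$Age[is.na(passengers$Age)] <- mean(passengers$Age, na.rm = TRUE)

# Analysis: cross-tabulate sex against survival
survival_by_sex <- table(passengers$Sex, passengers$Survived)
print(survival_by_sex)
```

On real data, the same steps would typically start from `read.csv()` on the dataset file rather than from an inline data frame.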

In a real data exploration ...
