Appendix B. Dataset – Survival of Passengers on the Titanic

Before the exploration process, we would like to introduce the example adopted here. It is the demographic information on passengers aboard the RMS Titanic, provided by Kaggle (a platform for data prediction competitions). The outcome we are examining is whether a passenger on board survived the shipwreck.

There are two reasons for using this dataset:

  • The sinking of the RMS Titanic is considered the most infamous shipwreck in history, with a death toll of up to 1,502 out of 2,224 passengers and crew. However, after the ship sank, the passengers' survival was not determined by chance alone; cabin class, sex, age, and other factors might also have affected their ...
