Creating a Titanic database

We are going to start from scratch and go back to the original Titanic dataset available at https://github.com/alexperrier/packt-aml/blob/master/ch4/original_titanic.csv. Follow these steps to prepare the CSV file:

  1. Open the original_titanic.csv file.
  2. Remove the header row.
  3. Remove the following punctuation characters from the data values: commas (,), double quotes ("), and parentheses ().
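
The steps above can also be scripted rather than done by hand. The following is a minimal Python sketch, not necessarily the method used to produce the published file. It assumes that the punctuation is stripped only from inside the field values and that the commas separating the fields are kept, so the result is still a comma-separated file. The function names (clean_rows, clean_csv_text, clean_csv_file) are hypothetical and chosen for illustration.

```python
import csv
import io

# Characters to strip from every field value (step 3).
PUNCTUATION = ',"()'
_STRIP_TABLE = str.maketrans("", "", PUNCTUATION)


def clean_rows(reader):
    """Yield cleaned rows: header skipped, punctuation removed from each field."""
    next(reader, None)  # step 2: drop the header row
    for row in reader:
        if row:  # ignore completely blank lines
            yield [field.translate(_STRIP_TABLE) for field in row]


def clean_csv_text(text):
    """Return the cleaned content as a string, one comma-separated line per row."""
    # csv.reader parses quoted fields first, so commas inside a quoted name
    # are treated as part of the value and not as delimiters.
    reader = csv.reader(io.StringIO(text))
    return "".join(",".join(row) + "\n" for row in clean_rows(reader))


def clean_csv_file(src_path, dst_path):
    """Read src_path (step 1), clean it, and write the result to dst_path."""
    with open(src_path, newline="", encoding="utf-8") as src:
        cleaned = clean_csv_text(src.read())
    with open(dst_path, "w", newline="", encoding="utf-8") as dst:
        dst.write(cleaned)
```

Assuming original_titanic.csv has been downloaded to the working directory, calling clean_csv_file("original_titanic.csv", "titanic_for_athena.csv") writes the cleaned file. Parsing with the csv module before stripping characters matters because the passenger names contain commas inside quoted fields; removing characters from the raw text line by line would not distinguish those from the field delimiters.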

The file should now contain only data, with no column names. It is the original dataset with 1,309 rows, ordered by pclass and then alphabetically by name. The resulting file is available at https://github.com/alexperrier/packt-aml/blob/master/ch4/titanic_for_athena.csv.

Next, let us create a new athena_data folder in our S3 bucket and upload the titanic_for_athena.csv file to it. Then go to the Athena console. ...
