Download the sample dataset

Download the sample dataset from GitHub by running the following commands on any server that can reach the MySQL database. The first command installs Git, and the second clones the test_db repository:

[user@master ~]$ sudo yum install git -y
[user@master ~]$ git clone https://github.com/datacharmer/test_db
Cloning into 'test_db'...
remote: Counting objects: 98, done.
remote: Total 98 (delta 0), reused 0 (delta 0), pack-reused 98
Unpacking objects: 100% (98/98), done.
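
If these steps are likely to be repeated on the same server, it may help to wrap the clone in a small guard so that a second run does not fail on an existing directory. The following is a minimal sketch under that assumption. The function name fetch_dataset and its two arguments are hypothetical and chosen only for illustration; the only external command it calls is git clone, as used above.

```shell
#!/bin/sh
# Sketch: clone the sample dataset only if it is not already present.
# fetch_dataset is a hypothetical helper name used for illustration.
fetch_dataset() {
    repo_url=$1   # for example https://github.com/datacharmer/test_db
    dest_dir=$2   # for example test_db

    # An existing .git directory means a previous clone is in place.
    if [ -d "$dest_dir/.git" ]; then
        echo "$dest_dir already present, skipping clone"
        return 0
    fi

    # Fail early with a clear message if git is not installed.
    if ! command -v git >/dev/null 2>&1; then
        echo "git not found; install it first (for example: sudo yum install git -y)" >&2
        return 1
    fi

    git clone "$repo_url" "$dest_dir"
}
```

Calling fetch_dataset https://github.com/datacharmer/test_db test_db would then clone the repository on the first run and do nothing on later runs.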
