Once all the Druid-related services are running in our Hadoop cluster, we need a sample dataset to load before we can run any analytics tasks.
Let's see how to load sample data. First, download the Druid archive from the internet:
[druid@node-3 ~]$ curl -O http://static.druid.io/artifacts/releases/druid-0.12.0-bin.tar.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  222M  100  222M    0     0  1500k      0  0:02:32  0:02:32 --:--:--  594k
Extract the archive:
[druid@node-3 ~]$ tar -xzf druid-0.12.0-bin.tar.gz
Copy the sample Wikipedia data to Hadoop:
[druid@node-3 ~]$ cd druid-0.12.0
[druid@node-3 ~/druid-0.12.0]$ hadoop fs -mkdir /user/druid/quickstart
[druid@node-3 ...
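The transcript above is cut off after creating the HDFS directory. As a sketch, the remaining upload step typically looks like the following; the sample file name is an assumption based on the file shipped in the Druid 0.12.0 quickstart directory, so verify it with ls quickstart/ in your extracted archive before running:

# Assumed sample file name from the Druid quickstart directory; confirm with: ls quickstart/
[druid@node-3 ~/druid-0.12.0]$ hadoop fs -put quickstart/wikiticker-2015-09-12-sampled.json.gz /user/druid/quickstart/
# Verify the file landed in HDFS
[druid@node-3 ~/druid-0.12.0]$ hadoop fs -ls /user/druid/quickstart/

With the sample data in HDFS, Druid's Hadoop-based ingestion tasks can reference it by its /user/druid/quickstart path.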