How to do it...

  1. You can start by downloading the dataset using either of the following commands:
wget https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data

You can also use the following command:

curl https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data -o iris.data

You can also download the file directly from the following URL:

https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data
  2. Now we begin our first step of data exploration by examining how the data in iris.data is formatted:
head -5 iris.data
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
4.6,3.1,1.5,0.2,Iris-setosa
5.0,3.6,1.4,0.2,Iris-setosa
  3. Now we take a look at the iris data to see how it is formatted:
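As a rough illustration of this kind of format check, the sketch below recreates the five sample rows in a scratch file and uses awk to confirm that every row has the same number of comma-separated fields and to tally the labels in the last column. The scratch file and the variable names (sample, field_counts, label_counts) are hypothetical and not part of the recipe; to inspect the real download, you could point the same awk commands at iris.data.

```shell
# Illustrative sketch only: the scratch file and variable names are assumptions.
# Recreate the five sample rows in a temporary file so iris.data is left untouched.
sample=$(mktemp)
cat > "$sample" <<'EOF'
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
4.6,3.1,1.5,0.2,Iris-setosa
5.0,3.6,1.4,0.2,Iris-setosa
EOF

# Distinct field counts across non-empty lines; a consistent file yields one value.
field_counts=$(awk -F',' 'NF > 0 { print NF }' "$sample" | sort -u)
echo "fields per row: $field_counts"

# Number of rows per label, where the label is the fifth field.
label_counts=$(awk -F',' 'NF == 5 { n[$5]++ } END { for (k in n) print k, n[k] }' "$sample")
echo "$label_counts"

# Clean up the scratch file.
rm -f "$sample"
```

The NF > 0 guard skips blank lines, which is useful if the downloaded file ends with an empty line.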
