R cluster analysis

In this example, we use R's cluster analysis functions to identify clusters in the wheat dataset from http://www.ics.uci.edu/.

The R script we want to use in Jupyter is the following:

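# load the wheat data set from uci.edu
# the file has no header row, so header=FALSE keeps the first record as data
wheat <- read.csv("http://archive.ics.uci.edu/ml/machine-learning-databases/00236/seeds_dataset.txt", sep="\t", header=FALSE)
# define useful column names
colnames(wheat) <- c("area", "perimeter", "compactness", "length", "width", "asymmetry", "groove", "undefined")
# exclude incomplete cases (rows containing NA) from the data
wheat <- wheat[complete.cases(wheat),]
# calculate the clusters (k-means with 5 centers; initial centers are random, so results vary between runs)
fit <- kmeans(wheat, 5)
# print the fitted clustering
fit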

Once entered into a notebook and run, the script produces cluster information like this:

K-means clustering ...
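
The printed summary bundles several components of the object that kmeans returns, and each can also be read on its own. The following is a minimal sketch, not the book's code: it assumes a small synthetic data frame (the hypothetical toy, with made-up area and width values) in place of the downloaded wheat data, and uses 2 clusters instead of 5 so the output stays short.

```r
# Synthetic stand-in for the wheat measurements (made-up values, not the UCI data)
set.seed(42)  # kmeans starts from random centers; fixing the seed makes runs repeatable
toy <- data.frame(
  area  = c(rnorm(20, mean = 12,  sd = 0.5), rnorm(20, mean = 18,  sd = 0.5)),
  width = c(rnorm(20, mean = 2.8, sd = 0.1), rnorm(20, mean = 3.7, sd = 0.1))
)

# Same call shape as in the script, but with 2 clusters
toy_fit <- kmeans(toy, centers = 2)

toy_fit$size           # number of rows assigned to each cluster
toy_fit$centers        # per-cluster mean of every column
head(toy_fit$cluster)  # cluster index assigned to the first few rows
toy_fit$tot.withinss   # total within-cluster sum of squares
```

The same component names (size, centers, cluster, tot.withinss) apply to the fit object from the wheat script, so an expression such as fit$centers should show the five cluster centers there.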

Learning Jupyter by Dan Toomey
