R cluster analysis
In this example, we use R's cluster analysis functions to find clusters in the wheat dataset from http://www.ics.uci.edu/.
The R script we want to use in Jupyter is the following:
# load the wheat data set from uci.edu
wheat <- read.csv("http://archive.ics.uci.edu/ml/machine-learning-databases/00236/seeds_dataset.txt", sep="\t")

# define useful column names
colnames(wheat) <- c("area", "perimeter", "compactness", "length", "width", "asymmetry", "groove", "undefined")

# exclude incomplete cases from the data
wheat <- wheat[complete.cases(wheat),]

# calculate the clusters
fit <- kmeans(wheat, 5)
fit
Once entered into a notebook, we have something like this:
The resulting cluster information is K-means clustering ...
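The object returned by kmeans() is a list, so its parts can also be inspected one at a time instead of printing everything at once. The following is a minimal sketch, not the book's own code: the `toy` data frame is a hypothetical stand-in for the wheat data (two made-up numeric columns) so that it runs without a download, and the use of `set.seed()` and `nstart` is an added assumption to make the output repeatable.

```r
# Hypothetical stand-in for the wheat data: two numeric columns holding
# two well-separated groups, so the sketch runs without a network download.
set.seed(42)  # kmeans() starts from random centers; fix the seed for repeatable runs
toy <- data.frame(
  area      = c(rnorm(20, mean = 12, sd = 0.5), rnorm(20, mean = 19, sd = 0.5)),
  perimeter = c(rnorm(20, mean = 13, sd = 0.3), rnorm(20, mean = 16, sd = 0.3))
)

# nstart = 10 tries ten random starts and keeps the best solution
fit <- kmeans(toy, centers = 2, nstart = 10)

fit$size           # number of rows assigned to each cluster
fit$centers        # per-cluster mean of every column
head(fit$cluster)  # cluster index assigned to the first few rows
fit$tot.withinss   # total within-cluster sum of squares (lower means tighter clusters)
```

The same components should be available on the `fit` object computed from the wheat data, where `centers` would have one row per cluster and one column per variable.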