Chapter 13

Creating Basic Examples of Unsupervised Predictions

IN THIS CHAPTER

Working with a sample dataset

Creating simple predictive models using clustering algorithms

Visualizing and evaluating your results

This chapter walks through building a few simple predictive models with unsupervised learning, using clustering algorithms such as K-means, DBSCAN, and mean shift. The examples use Python 2.7.4 on a Windows machine. See Chapter 12 if you need instructions for installing Python and the scikit-learn machine-learning package.
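As a quick preview, here is a minimal sketch of how the three algorithms named above can be run with scikit-learn's `KMeans`, `DBSCAN`, and `MeanShift` classes on the Iris measurements. The parameter values (`n_clusters=3`, `eps=0.5`, `min_samples=5`, `random_state=0`) are illustrative assumptions, not settings taken from this chapter, and the single-argument `print(...)` calls are written so that they also run under Python 2.7.

```python
from sklearn.cluster import DBSCAN, KMeans, MeanShift
from sklearn.datasets import load_iris

# Only the measurements are used; the species labels are ignored
# because clustering is unsupervised.
X = load_iris().data

# Parameter values below are illustrative assumptions.
models = {
    "K-means": KMeans(n_clusters=3, n_init=10, random_state=0),
    "DBSCAN": DBSCAN(eps=0.5, min_samples=5),
    "Mean shift": MeanShift(),
}

cluster_labels = {}
for name, model in models.items():
    # fit_predict learns the clusters and returns one label per row of X.
    labels = model.fit_predict(X)
    cluster_labels[name] = labels
    # DBSCAN marks noise points with the label -1, so exclude it from the count.
    n_clusters = len(set(labels) - {-1})
    print("%s found %d cluster(s)" % (name, n_clusters))
```

Note the difference in what you have to supply: K-means needs the number of clusters up front, whereas DBSCAN and mean shift work out the number of clusters from the data and their own parameters.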

No prior knowledge of supervised learning is required to understand the concepts of unsupervised learning. In supervised learning, the output categories are known in the historical data; in unsupervised learning, they are unknown. Chapter 12 covers examples of supervised learning with classification and regression algorithms.
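The distinction shows up directly in code. In the sketch below, the supervised model is fitted with both the observations and their known categories, while the clustering model sees only the observations. The tiny dataset and the choice of `LogisticRegression` as the supervised model are assumptions made purely for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

# Four made-up observations forming two obvious groups (hypothetical data).
X = [[1.0, 1.0], [1.2, 0.9], [8.0, 8.0], [8.1, 7.9]]

# Supervised: the output category of each observation is known in advance.
y = [0, 0, 1, 1]
classifier = LogisticRegression()
classifier.fit(X, y)          # learns from observations AND known categories
predicted = classifier.predict([[1.1, 1.0]])

# Unsupervised: no categories are supplied; the algorithm groups the rows itself.
clusterer = KMeans(n_clusters=2, n_init=10, random_state=0)
clusterer.fit(X)              # learns from observations only
groups = clusterer.labels_

print(predicted)
print(groups)
```

The cluster numbers that K-means assigns are arbitrary identifiers. They tell you which observations belong together, not what the groups mean.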

You can read Chapters 12 and 13 independently. One advantage of reading both chapters in the same session is that you'll be able to reuse the work that you did to load the Iris dataset into the Python interpreter (the command line where you enter the code statements or commands). So if you're continuing from Chapter 12, you may skip the next section.

Getting the Sample Dataset

The sample Iris dataset is included in the installation of scikit-learn, along with a set of functions that load data into the Python session.

To load the Iris dataset, follow these steps:

  1. Open a ...
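As a rough sketch of what loading the dataset can look like (not necessarily the exact statements used in the steps of this chapter), scikit-learn's `load_iris` function returns an object that holds the measurements, the species codes, and their names. The variable name `iris` is an assumption chosen for illustration.

```python
from sklearn.datasets import load_iris

# Load the bundled Iris dataset into the current Python session.
iris = load_iris()

# 150 flowers, each described by 4 measurements.
print(iris.data.shape)

# Names of the four measured features.
print(iris.feature_names)

# The three species. Clustering ignores these, but they are
# useful later for checking how well the clusters match.
print(iris.target_names)
```

You can enter these statements one at a time in the Python interpreter and inspect each result before moving on.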
