K-means and hierarchical clustering with Python

by Joel Grus

Released August 2016

Publisher(s): O'Reilly Media, Inc.

ISBN: 9781491966174

Book description

Clustering is the usual starting point for unsupervised machine learning. This lesson introduces the k-means and hierarchical clustering algorithms, implemented in Python code.

Why is it important?

Whenever you look at a data source, it's likely that the data will somehow form clusters. Higher-dimensional datasets become increasingly difficult to "eyeball" using human perception and intuition alone. These clustering algorithms let you discover similarities within data at scale, without first having to label a large training dataset.

What you'll learn—and how you can apply it

Understand how the k-means and hierarchical clustering algorithms work. Create classes in Python to implement these algorithms, and learn how to apply them in example applications. Identify clusters of similar inputs, and find a representative value for each cluster. Prepare to use your own implementations or reuse algorithms implemented in scikit-learn.
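
As a rough illustration of the k-means idea described above, here is a minimal pure-Python sketch. It is not the lesson's own code: the class name `KMeans`, its `train` and `classify` methods, the `seed` and `max_iter` parameters, and the helper functions are all assumptions made for this example. It picks k starting means at random, assigns each point to its nearest mean, recomputes each mean from its members, and stops when the assignments no longer change.

```python
import random
from typing import List, Optional, Sequence

Vector = Sequence[float]


def squared_distance(v: Vector, w: Vector) -> float:
    """Sum of squared component-wise differences between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(v, w))


def vector_mean(vectors: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


class KMeans:
    """Hypothetical minimal k-means clusterer (illustrative only)."""

    def __init__(self, k: int, seed: int = 0) -> None:
        self.k = k
        self.rng = random.Random(seed)  # fixed seed keeps runs repeatable
        self.means: List[List[float]] = []

    def classify(self, point: Vector) -> int:
        """Return the index of the mean closest to the point."""
        return min(range(len(self.means)),
                   key=lambda i: squared_distance(point, self.means[i]))

    def train(self, points: List[Vector], max_iter: int = 100) -> None:
        # Start from k distinct input points chosen at random.
        self.means = [list(p) for p in self.rng.sample(points, self.k)]
        assignments: Optional[List[int]] = None
        for _ in range(max_iter):
            new_assignments = [self.classify(p) for p in points]
            if new_assignments == assignments:
                return  # nothing moved, so the means are stable
            assignments = new_assignments
            for i in range(self.k):
                members = [p for p, a in zip(points, assignments) if a == i]
                if members:  # keep the old mean if a cluster is empty
                    self.means[i] = vector_mean(members)


# Two well-separated groups of 2-D points (made-up data).
points = [(1, 1), (1, 2), (2, 1), (10, 10), (10, 11), (11, 10)]
model = KMeans(k=2)
model.train(points)
labels = [model.classify(p) for p in points]
print(labels)
print(model.means)
```

Each entry in `model.means` serves as the representative value for its cluster. Which cluster gets index 0 depends on the random starting points, so only the grouping is meaningful, not the label numbers.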

This lesson is for you because…

You're interested in data science and want to learn how to implement the k-means and bottom-up hierarchical clustering algorithms.
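
For a sense of what "bottom-up" means here, the following is a small sketch under stated assumptions, not the lesson's implementation. It starts with every point in its own cluster and repeatedly merges the two closest clusters until a requested number remain. The function names and the choice of minimum point-to-point distance between clusters (single linkage) are assumptions for this example; other distance rules are equally valid.

```python
from itertools import combinations
from typing import List, Tuple

Point = Tuple[float, ...]
Cluster = List[Point]


def squared_distance(p: Point, q: Point) -> float:
    """Sum of squared component-wise differences between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def cluster_distance(c1: Cluster, c2: Cluster) -> float:
    """Single linkage: distance between the closest pair of members."""
    return min(squared_distance(p, q) for p in c1 for q in c2)


def bottom_up_cluster(points: List[Point], num_clusters: int) -> List[Cluster]:
    """Merge the two closest clusters until num_clusters remain.

    Assumes 1 <= num_clusters <= len(points).
    """
    clusters: List[Cluster] = [[p] for p in points]  # one cluster per point
    while len(clusters) > num_clusters:
        # Find the pair of cluster indices with the smallest distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: cluster_distance(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return clusters


# Made-up data: two well-separated groups of 2-D points.
points = [(1, 1), (1, 2), (2, 1), (10, 10), (10, 11), (11, 10)]
result = bottom_up_cluster(points, num_clusters=2)
print(result)
```

This version recomputes every pairwise cluster distance on each pass, which is fine for a handful of points but slow for large datasets.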

Prerequisites

Some experience writing code in Python
Experience working with data in vector or matrix format

Materials or downloads needed in advance

Download this code. This lesson's code is in Chapter 19, and you'll also need the linear_algebra functions from Chapter 4.

This lesson is taken from Data Science from Scratch by Joel Grus.
