Python Machine Learning Cookbook by Prateek Joshi

Automatically estimating the number of clusters using the DBSCAN algorithm

When we discussed the k-means algorithm, we saw that the number of clusters has to be supplied as one of the input parameters. In the real world, this information is usually not available. We can sweep the parameter space and pick the number of clusters that gives the best silhouette coefficient score, but that means rerunning the clustering for every candidate value, which is expensive. Wouldn't it be nice if there were a method that could just tell us the number of clusters in our data? This is where Density-Based Spatial Clustering of Applications with Noise (DBSCAN) comes into the picture.
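To make the cost of the sweep concrete, here is a minimal sketch of what it could look like with scikit-learn. The synthetic three-blob dataset, the candidate range of 2 to 6 clusters, and the variable names are assumptions made for illustration; they do not come from the recipe itself.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic data (an assumption for illustration): three well-separated 2D blobs
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in centers])

# Fit k-means once per candidate k and record the silhouette score
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

# The candidate with the highest silhouette score is our estimate
best_k = max(scores, key=scores.get)
print("Silhouette scores:", scores)
print("Best k:", best_k)
```

Note that every candidate value triggers a full k-means fit plus a silhouette computation over all the points, which is why this approach gets expensive as the dataset or the candidate range grows.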

This works by treating datapoints as groups of dense clusters. If a point belongs to a cluster, then there should be a lot ...
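The density idea can be sketched from scratch in a few lines. What follows is a simplified, unoptimized illustration of the standard DBSCAN procedure, not the code from this recipe. The parameter names `eps` and `min_samples` and the noise label `-1` mirror scikit-learn's conventions, while the function `dbscan` and the toy points are hypothetical and chosen only for illustration.

```python
from math import dist

NOISE = -1
UNVISITED = None

def dbscan(points, eps, min_samples):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself)
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        # Point i is a core point: start a new cluster and grow it
        labels[i] = cluster_id
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point: joins, but is not expanded
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster_id
            nbrs = neighbors(j)
            if len(nbrs) >= min_samples:
                queue.extend(nbrs)  # j is also a core point, keep growing
        cluster_id += 1
    return labels

# Toy data (an assumption): two dense groups and one isolated point
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_samples=3)

# The number of clusters is never passed in; it falls out of the labels
n_clusters = len(set(labels) - {NOISE})
print("Labels:", labels)
print("Estimated number of clusters:", n_clusters)
```

The number of clusters is an output here rather than an input. The trade-off is that we now have to choose `eps` and `min_samples` instead.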
