Clustering versus classification

It might be confusing for beginners to distinguish between a clustering problem and a classification problem. Classification is fundamentally different from clustering. Classification is a supervised learning problem where your class or target variable is known to train a dataset. The algorithm is trained to look at the examples (features and class or target variables) and then you score and test it with a test dataset.

Clustering, being an unsupervised learning, it works on a dataset with no label or class variable. Also, you don't perform scoring and testing with a test dataset. So, you just apply your algorithm to your data and group them into a different cluster, say 1, 2 and 3, which were not known before. ...

Get Microsoft Azure Machine Learning now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.