Summary

This completes the overview of three of the most commonly used unsupervised learning techniques:

  • K-means for clustering fully observed features of a model with a reasonable number of dimensions
  • Expectation-maximization for clustering a combination of observed and latent features
  • Principal components analysis to transform and extract the most critical features in terms of variance
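
As a reminder of how the first of these techniques operates, here is a minimal sketch of the K-means iteration (assign each observation to its nearest centroid, then recompute each centroid as the mean of its cluster). It is an illustration only, not the implementation developed in this chapter: the names KMeansSketch, dist2, nearest, mean and fit are hypothetical, the initial centroids are supplied by the caller, and the observations are assumed to be dense vectors of equal dimension.

```scala
object KMeansSketch {
  // Hypothetical alias: an observation is a dense vector of features
  type Point = Vector[Double]

  // Squared Euclidean distance between two points of equal dimension
  def dist2(a: Point, b: Point): Double =
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

  // Index of the centroid closest to the point p (first one wins on ties)
  def nearest(p: Point, centroids: Vector[Point]): Int =
    centroids.indices.minBy(i => dist2(p, centroids(i)))

  // Component-wise mean of a non-empty collection of points
  def mean(points: Vector[Point]): Point =
    points.transpose.map(col => col.sum / col.size)

  // Repeats the assignment and update steps until the centroids stop
  // moving or the iteration budget is exhausted
  @annotation.tailrec
  def fit(
      data: Vector[Point],
      centroids: Vector[Point],
      maxIters: Int = 50): Vector[Point] = {
    // Assignment step: group observations by their nearest centroid
    val groups = data.groupBy(nearest(_, centroids))
    // Update step: an empty cluster keeps its previous centroid
    val updated = centroids.indices.toVector.map { i =>
      groups.get(i).map(mean).getOrElse(centroids(i))
    }
    if (maxIters <= 1 || updated == centroids) updated
    else fit(data, updated, maxIters - 1)
  }
}
```

Calling KMeansSketch.fit with the four observations (1, 1), (1, 2), (9, 9), (10, 9) and the initial centroids (0, 0) and (10, 10) converges to the centroids (1.0, 1.5) and (9.5, 9.0). Keeping the previous centroid for an empty cluster is a simplification chosen for this sketch; other strategies, such as re-seeding the cluster, are possible.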

The key point to remember is that unsupervised learning techniques are used:

  • By themselves to extract structures and associations from unlabelled observations
  • As a preprocessing stage for supervised learning, to reduce the number of features prior to the training phase

In the next chapter, we will address the second use case, and cover supervised learning techniques starting with ...
