Data dimensionality reduction

So far in this chapter, we have looked at the basic concepts of supervised and unsupervised learning using the simplest possible examples, each of which considered only a limited number of factors that contribute to the outcome. In the real world, however, the data available for analysis and model generation typically involves a very large number of factors. Every additional factor adds one dimension to the space, and beyond the third dimension it becomes difficult to visualize the data in any intuitive form. Each new dimension also adds to the computational cost of generating the model.
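One rough way to see why each extra dimension is costly is to count how many cells a fixed-resolution grid needs to cover the space. The sketch below is purely illustrative and is not taken from the book: the function names `grid_cells` and `samples_per_cell`, the choice of 10 bins per axis, and the dataset size of one million rows are all assumptions made for the example.

```python
# Illustrative sketch (hypothetical names and numbers): how quickly a
# fixed-resolution grid grows as factors (dimensions) are added.


def grid_cells(bins_per_axis: int, n_dimensions: int) -> int:
    """Number of cells when every axis is split into bins_per_axis intervals."""
    return bins_per_axis ** n_dimensions


def samples_per_cell(n_samples: int, bins_per_axis: int, n_dimensions: int) -> float:
    """Average number of samples per cell for a dataset of n_samples rows."""
    return n_samples / grid_cells(bins_per_axis, n_dimensions)


if __name__ == "__main__":
    n_samples = 1_000_000  # assumed dataset size, for illustration only
    for d in (1, 2, 3, 6, 10):
        cells = grid_cells(10, d)
        density = samples_per_cell(n_samples, 10, d)
        print(f"dimensions={d:2d}  cells={cells:>14,d}  avg samples per cell={density}")
```

Under these assumptions, the same one million rows that fill a three-dimensional grid densely average only one row per cell at six dimensions, and far less than one row per cell at ten. The data becomes sparse relative to the space it occupies, which is one reason models over many factors are harder and slower to fit.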

In the world of big data, where we now have the capability to bring in data from heterogeneous data ...
