Python for Data Science For Dummies by Luca Massaron, John Paul Mueller

Chapter 14

Reducing Dimensionality

In This Chapter

Discovering the magic of singular value decomposition

Understanding the difference between factors and components

Matching unknown images to known ones

Automatically retrieving topics from texts

Building a movie recommender system

Big data is a collection of datasets so large that it becomes difficult to process using traditional techniques. Working with big data is what distinguishes data science problems from statistical problems, which are based on small samples. You typically use traditional statistical techniques on small problems and data science techniques on big problems.

Data may be viewed as big because it consists of many examples, which is the first kind of big that comes to mind. Analyzing a database of millions of customers and interacting with them all simultaneously is challenging, but that isn’t the only possible perspective on big data.

Another potential view of big data ...
