Multivariate regression

It is possible to minimize multiple metrics at the same time. While Spark has only a few multivariate analysis tools, more traditional, well-established packages come with Multivariate Analysis of Variance (MANOVA), a generalization of the Analysis of Variance (ANOVA) method. I will cover ANOVA and MANOVA in Chapter 7, Working with Graph Algorithms.

For a practical analysis, we first need to determine whether the target variables are correlated, for which we can use the PCA Spark implementation covered in Chapter 3, Working with Spark and MLlib. If the dependent variables are strongly correlated, maximizing one leads to maximizing the other, and we can just maximize the first principal component (and potentially build a regression ...
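
To make the correlation check concrete, here is a minimal plain-Scala sketch for the case of two target variables. It is not the Spark PCA implementation referred to above: it computes the Pearson correlation and the first principal component of the 2x2 covariance matrix in closed form, with no Spark dependency. The object `TargetCorrelation` and all of its method names are hypothetical and introduced only for illustration.

```scala
// Plain-Scala sketch, no Spark dependency. All names are hypothetical.
object TargetCorrelation {

  /** Arithmetic mean of a non-empty sequence. */
  def mean(xs: Seq[Double]): Double = xs.sum / xs.size

  /** Sample covariance (n - 1 denominator) of two equally sized samples. */
  def covariance(xs: Seq[Double], ys: Seq[Double]): Double = {
    require(xs.size == ys.size && xs.size > 1, "need equally sized samples with n > 1")
    val (mx, my) = (mean(xs), mean(ys))
    xs.zip(ys).map { case (x, y) => (x - mx) * (y - my) }.sum / (xs.size - 1)
  }

  /** Pearson correlation; NaN if either sample has zero variance. */
  def pearson(xs: Seq[Double], ys: Seq[Double]): Double =
    covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

  /** Unit-length first principal component of the covariance matrix [[a, b], [b, c]]. */
  def firstComponent(xs: Seq[Double], ys: Seq[Double]): (Double, Double) = {
    val (a, b, c) = (covariance(xs, xs), covariance(xs, ys), covariance(ys, ys))
    // Largest eigenvalue of the symmetric 2x2 matrix.
    val lambda = (a + c) / 2 + math.sqrt(math.pow((a - c) / 2, 2) + b * b)
    // Matching eigenvector; if b == 0 the targets are uncorrelated and the
    // component is simply the axis with the larger variance.
    val (vx, vy) =
      if (b != 0.0) (b, lambda - a)
      else if (a >= c) (1.0, 0.0)
      else (0.0, 1.0)
    val norm = math.hypot(vx, vy)
    (vx / norm, vy / norm)
  }

  /** Projects each centered (x, y) pair onto the first component,
    * giving one combined target value per observation. */
  def project(xs: Seq[Double], ys: Seq[Double]): Seq[Double] = {
    val (vx, vy) = firstComponent(xs, ys)
    val (mx, my) = (mean(xs), mean(ys))
    xs.zip(ys).map { case (x, y) => (x - mx) * vx + (y - my) * vy }
  }
}
```

If `TargetCorrelation.pearson(xs, ys)` is close to 1, the values returned by `TargetCorrelation.project(xs, ys)` can stand in as a single combined target. With more than two targets, or with data that does not fit on one machine, the closed-form eigenvector no longer applies and a library PCA routine is the better choice.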
