Hyperparameters

We have glossed over an important aspect: model tuning. As you can see, each algorithm exposes many tunable parameters, and so far we have set each of them only once. For example, in the case of the recommender, we set rank=12, regularizationParameter=0.1, and maxIterations=20. In reality, the rank could be 8 or 12; the regularization parameter 0.1, 1.0, or 10; and the iterations 10 or 20. Trying every combination means 2 × 3 × 2 = 12 runs: we calculate the accuracy of each and then select the parameters that give the best value. This is a simple case; we might have more than 100 runs and many parameters. This is where cross-validation comes into the picture. To keep this book within its boundaries, I will leave this part ...
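The 12 runs described above can be enumerated mechanically. The following is a minimal sketch in plain Python, not the book's code: the names `param_grid`, `expand_grid`, and `select_best` are hypothetical, and `evaluate` stands in for whatever routine trains the recommender with a given parameter set and returns an error measure where lower is better (an assumption; the text speaks only of "accuracy").

```python
from itertools import product

# Candidate values taken from the paragraph above.
# The dictionary keys are illustrative names, not a library API.
param_grid = {
    "rank": [8, 12],
    "regularization_parameter": [0.1, 1.0, 10.0],
    "max_iterations": [10, 20],
}


def expand_grid(grid):
    """Return one dict per combination of parameter values."""
    names = list(grid)
    value_lists = [grid[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]


def select_best(candidates, evaluate):
    """Score every candidate with evaluate() and return the
    (params, error) pair with the lowest error.

    evaluate is a hypothetical callable supplied by the caller; it is
    assumed to train a model with the given parameters and return a
    number where lower is better.
    """
    scored = [(params, evaluate(params)) for params in candidates]
    return min(scored, key=lambda pair: pair[1])


candidates = expand_grid(param_grid)
print(len(candidates))  # 12
```

With a real scoring routine in hand, `select_best(candidates, evaluate)` would return the winning parameter set together with its error. The number of runs is the product of the list lengths, so each extra parameter or candidate value multiplies the cost, which is how the count grows past 100 so quickly.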
