Summary

As of this writing, SparkR does not support every algorithm available in Spark, but active development is closing the gap. The Spark 2.0 release improved algorithm coverage, which now includes Naïve Bayes, k-means clustering, and survival regression. Check the latest documentation for the current list of supported algorithms. Work is also underway on a CRAN release of SparkR, tighter integration with R packages and Spark packages, and better RFormula support.
