Join data scientist Kelly O'Briant for an exploration of sparklyr, the RStudio package that provides an R interface to Apache Spark. For many data scientists who rely on R, the shift from local in-memory computation to scalable distributed data processing can be difficult to navigate. This course provides an easy-to-follow, R-based method for working with big data. You'll connect to Spark, run some sparklyr code, and explore practical applications of Spark SQL and sparklyr functionality. You'll wrap up by performing exploratory analysis and feature generation on a Kaggle competition data set. Learners should have moderate experience with data science tasks or workflows in R.
Kelly O'Briant is a data scientist and lead R developer with Washington, DC-based B23 LLC. She holds degrees in Computational Science and Informatics from George Mason University and in Bioinformatics from Virginia Commonwealth University. Kelly is a founder and co-organizer of the Washington, DC chapter of R-Ladies Global. She gives talks on R cloud computing, R data products, and sparklyr at R-Ladies meetups and R conferences.