Bibliography

Armbrust, Michael, Reynold S. Xin, Cheng Lian, Yin Huai, Davies Liu, Joseph K. Bradley, Xiangrui Meng, Tomer Kaftan, Michael J. Franklin, Ali Ghodsi, and Matei Zaharia. Spark SQL: Relational Data Processing in Spark. https://amplab.cs.berkeley.edu/publication/spark-sql-relational-data-processing-in-spark.

Avro. http://avro.apache.org/docs/current.

Ben-Hur, Asa and Jason Weston. A User’s Guide to Support Vector Machines. http://pyml.sourceforge.net/doc/howto.pdf.

Breiman, Leo. Random Forests. https://www.stat.berkeley.edu/~breiman/randomforest2001.pdf.

Cassandra. http://cassandra.apache.org.

Chang, Fay, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E. Gruber. ...
