Hive and Impala

One of the design considerations for a new framework is always the compatibility with the old frameworks. For better or worse, most data analysts still work with SQL. The roots of the SQL go to an influential relational modeling paper (Codd, Edgar F (June 1970). A Relational Model of Data for Large Shared Data Banks. Communications of the ACM (Association for Computing Machinery) 13 (6): 377–87). All modern databases implement one or another version of SQL.

While the relational model was influential and important for bringing the database performance, particularly for Online Transaction Processing (OLTP) to the competitive levels, the significance of normalization for analytic workloads, where one needs to perform aggregations, ...

Get Mastering Scala Machine Learning now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.