Chapter 5. Cascalog: A Clojure DSL for Cascading
Why Use Cascalog?
Sometimes the tools we select change the way we approach a problem. As the proverb goes, if all you have is a hammer, everything looks like a nail. And sometimes our tools, over time, actually interfere with the process of solving a problem.
For most of the past three decades, SQL has been synonymous with database work. A couple of generations of programmers have grown up with relational databases as the de facto standard. Consider that while “NoSQL” has become quite a popular theme, most vendors in the Big Data space have been rushing (circa 2013Q1) to graft SQL features onto their frameworks.
Looking back four decades to the origins of the relational model (the 1970 paper by Edgar Codd, “A Relational Model of Data for Large Shared Data Banks”), the point was about relational models and not so much about databases and tables and structured queries. Codd himself detested SQL. The relational model was formally specified as a declarative “data sublanguage” (i.e., to be used within some other host language) based on first-order predicate logic. SQL is not that. In comparison, it forces programmers to focus largely on control flow issues and the structure of tables, to a much greater extent than the relational model intended. SQL’s semantics are also disjoint from the programming languages in which it gets used: Java, C++, Ruby, PHP, etc. For that matter, the term “relational” no longer even appears ...