In Search of Database Nirvana

The Swinging Database Pendulum

It often seems like the IT industry sways back and forth on technology decisions.

About a decade ago, new web-scale companies were gathering more data than ever before and needed new levels of scale and performance from their data systems. Relational Database Management Systems (RDBMSs) that could scale on Massively Parallel Processing (MPP) architectures already existed, such as the following:

  • NonStop SQL/MX for Online Transaction Processing (OLTP) or operational workloads

  • Teradata and HP Neoview for Business Intelligence (BI)/Enterprise Data Warehouse (EDW) workloads

  • Vertica, Aster Data, Netezza, Greenplum, and others, for analytics workloads

However, these proprietary databases shared some unfavorable characteristics:

  • They were expensive, in terms of both software and specialized hardware.

  • They did not offer schema flexibility, which is important for growing companies facing dynamic changes.

  • They could not scale elastically to meet the high volume and velocity of big data.

  • They did not handle semistructured and unstructured data very well. (Yes, you could stick that data into an XML, BLOB, or CLOB column, but very little was offered to process it easily without using complex syntax. Add-on capabilities had vendor tie-ins and minimal flexibility.)

  • They had not evolved User-Defined Functions (UDFs) beyond scalar functions, which limited the parallel processing of user code that Map/Reduce later facilitated.

  • They took ...
