Architectural considerations

A modern Data Lake based on Hadoop is now mainstream technology, used by many public-sector organizations and enterprises; however, the ecosystem is still evolving, and new tools and projects are released every quarter. In the next few sections, I will highlight the key architectural considerations that help keep your Data Lake well planned and extensible for years to come.

Extensible and modular

In Chapter 2, Enterprise Data Lake using HDInsight, we looked at the reference architecture for the next-generation Data Lake, as shown in the following figure. Use this architecture to design and build reusable components with well-defined interfaces between the layers, so that any component can be plugged in or swapped out without affecting the others. Let's take an example for ...
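
As a rough illustration of what a pluggable model between layers can look like, here is a minimal Python sketch. It is not taken from the reference architecture itself: the layer names (IngestionLayer, StorageLayer), the in-memory implementations, and the run_pipeline function are all hypothetical and chosen only to show the idea of coding against an interface rather than a concrete tool.

```python
from typing import Dict, Iterable, Iterator, List, Protocol


class IngestionLayer(Protocol):
    """Hypothetical interface: anything that can produce records."""

    def read(self) -> Iterable[Dict]: ...


class StorageLayer(Protocol):
    """Hypothetical interface: anything that can persist records."""

    def write(self, records: Iterable[Dict]) -> int: ...


class InMemorySource:
    """Stand-in ingestion component; a real one might read files or a queue."""

    def __init__(self, records: Iterable[Dict]) -> None:
        self._records: List[Dict] = list(records)

    def read(self) -> Iterator[Dict]:
        return iter(self._records)


class InMemoryStore:
    """Stand-in storage component; a real one might write to a distributed store."""

    def __init__(self) -> None:
        self.rows: List[Dict] = []

    def write(self, records: Iterable[Dict]) -> int:
        before = len(self.rows)
        self.rows.extend(records)
        # Report how many records this call added.
        return len(self.rows) - before


def run_pipeline(source: IngestionLayer, store: StorageLayer) -> int:
    """Depends only on the two interfaces, never on a concrete component."""
    return store.write(source.read())
```

Because run_pipeline sees only the two interfaces, either side can be replaced with a different implementation that exposes the same read or write method, and the rest of the code stays unchanged.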
