The raw data pond is what many organizations initially call the data lake. Too often, they’ll simply throw data into the lake and then wonder why they can’t do any meaningful analytic processing against the data. In fairness, analytical processing can be done against raw data in the data lake. It just requires a data scientist to do the analysis. But much more lucid and efficient data analysis can be done against data after it has been conditioned. Almost as important, once the data has been conditioned, it can then be analyzed by the ordinary business user.
- Chapter 4 Data Ponds
- from Data Lake Architecture: Designing the Data Lake and Avoiding the Garbage Dump
- Publisher: Technics Publications
- Released: April 2016
Raw data pond or data lake