We are all familiar with the software development cycle. Its six steps (Planning, Defining, Designing, Building, Testing and Deployment) are etched into our brains: we design before we start implementing. Designing a data model up front in this way gives us an understanding of the main business areas. But in the world of Big Data, data scientists dive right in with discovery and analysis. They rely on a “schema on read”: the data are loaded more or less as-is, and the schema evolves and is only designed when the data are pulled from the big data environment. For most data scientists a “schema on read” satisfies their needs, but where is the single version of the truth?
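To make the “schema on read” idea concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than part of any particular big data platform: the sample records, the field names, and the `READ_SCHEMA` and `read_with_schema` helpers are hypothetical. The raw lines are stored untouched, and types and defaults are imposed only at the moment the data are read.

```python
import json

# Hypothetical raw events, stored exactly as they arrived (no schema enforced on write).
RAW_LINES = [
    '{"customer_id": "42", "amount": "19.99", "channel": "web"}',
    '{"customer_id": 7, "amount": 5}',
    '{"customer_id": "13", "amount": "3.50", "coupon": "SPRING"}',
]

# Hypothetical reader-side schema: field name -> (type converter, default value).
READ_SCHEMA = {
    "customer_id": (int, None),
    "amount": (float, 0.0),
    "channel": (str, "unknown"),
}


def read_with_schema(lines, schema):
    """Parse raw JSON lines and project each record onto the reader's schema."""
    rows = []
    for line in lines:
        record = json.loads(line)
        row = {}
        for field, (convert, default) in schema.items():
            value = record.get(field)
            # Convert present values; fall back to the default for missing ones.
            row[field] = convert(value) if value is not None else default
        rows.append(row)
    return rows


rows = read_with_schema(RAW_LINES, READ_SCHEMA)
for row in rows:
    print(row)
```

Fields the reader did not ask for (such as `coupon`) are silently dropped, and a second reader could apply a different schema to the same raw lines. That flexibility is exactly what raises the question of a single version of the truth.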
It has been shown that Big Data can benefit from several things an Enterprise Data Model offers. Its data governance set-up, defined business areas, and explicit business rules are an extremely valuable foundation for your Big Data environment.
How do we combine these two views and foster collaboration between them?