Chapter 4. Data modeling: End to end 87
4.2.2 Define the granularity
Stage two of the process is to determine the right level of detail that must be
contained within the model and ultimately, within the fact table. As an example, a
model might contain a complete record of a sales transaction as one entry within
the fact table, with all the products sold contained within the one entry or maybe
not even listed. In this case, the lowest level is be the individual sales transaction.
An alternative can be to set the product as the lowest level, thus requiring
multiple entries within the fact table to represent just one sales transaction. The
grain (detail) of the fact table has a significant impact on the data warehouse
from business, technical, and development points of view.
From a business perspective, a dimensional model designed to the lowest level
of grain is easy to change, with new dimensions or facts being easily added.
Thus, the model can maintain future capability and be flexible enough to answer
questions at a lower level of detail.
From a technical perspective, the grain of a table has a major impact on the size
of the star schema and thus the size of the tables. This has an effect on table
implementation and the performance of transform and query operations.
From a development perspective, the development team might have to consider
more dimensions and have a greater understanding of the underlying production
tables. They might also have to perform a greater number of, and more complex,
transforms for getting the data into the warehouse.
At this point in the modelling process it is worth investigating documents and
account forms from within the business; items such as order forms, invoices, and
sales receipts can help determine an acceptable grain level. You also must
consider time- and date-dependent information, because many of the business
questions that you aim to answer can be time related. An example might be
“What was the number of sales of a particular item on a particular date.” In this
scenario, you have to model time as one of your dimensions. You do, however,
have to check that the source data does contain information at the day, month, or
year level.
4.2.3 Identify the dimension tables
Dimensional tables are a major factor in the construction of a Star Schema
dimensional model. As illustrated in Figure 4-2 on page 84, one or more
dimensional tables are connected to the fact table, with each normally containing
detailed and relevant information about a section or element of the complete fact
table.

Get Solving Operational Business Intelligence with InfoSphere Warehouse Advanced Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.