92 Solving Operational Business Intelligence with InfoSphere Warehouse Advanced Edition
򐂰 Nearly all dimensional models have one or more date and time dimensions.
These types of dimensions must be handled differently from other dimensions
such as geography or products. This is due to the additonal attributes that can
be applied to date and time dimensions.
򐂰 If a dimension table includes a code then, where possible, include the code
description as well. For example, if a product is identified by a product code,
and each code represents a product name, include both the code and the
product name in the dimension table.
򐂰 Normally, avoid having dimensional models include more than 10-15
dimension tables. If your model has more, look for dimension tables that can
be merged.
򐂰 Include the primary keys (surrogate keys) of the dimension tables in the fact
table as foreign keys. In this context, the surrogate keys are a unique
identifier for each row within a table.
If a unique key or identifier is required for each row within a table, and the
application data in the table does not provide within a single column, a new
column containing a derived or calculated unique identifier value is added to
the table. For performance reasons, the data type of this column is to be a
numeric of type INTEGER.
򐂰 A dimension table may have one or more dummy or unrelated records. This is
because a foreign key in a fact table can never be null; that by definition
violates referential integrity.
򐂰 The rows in a dimension table establish a one-to-many relationship with the
fact table. For example, there might be a number of sales to a single
customer, or a number of sales of a single product.
򐂰 The dimension table contains attributes associated with the dimension entry;
these attributes are rich and business-oriented textual details, such as
product name or customer name. These attributes serve as report labels and
query constraints.
򐂰 Have each dimension attribute aim to take on a single value in the context of
each measurement inside the fact table.
It is worth noting that the quality of a good dimensional model is directly related
to the quality of attributes present inside these dimension tables. The dimension
table attributes appear as report labels within the business reports generated by
BI tools such as IBM Cognos.
4.2.4 Identify the fact tables
The final process of developing a dimensional model is to identify tables within
the 3NF model that have the potential to become the required fact table at the
Chapter 4. Data modeling: End to end 93
center of the star schema. Different from the dimensional tables, the fact table
must align to a business process or action and provide the necessary means to
answer questions related to this process or action. Similar to the process of
de-normalizing a dimensional table, the final fact table might have to be a
combination of one or multiple tables from the 3NF model.
A fact table also must contain, along with the links to the dimensional tables, a
set of columns that provide the data on which to measure the business process
on which the fact table is based. These data fields are often called the “facts” or
“measures” of the fact table.
Consider the example of modeling a sales transaction. The fact table might
represent the business process of a customer buying a set of goods. In this case,
the number of items sold, either individually or as a whole, become the set of
measures or facts. Thus, in answer to the business questions of “How many of
product Y did a customer buy?”, you can use the measures within the fact table
to answer “They bought X amount of product Y.
94 Solving Operational Business Intelligence with InfoSphere Warehouse Advanced Edition
Figure 4-7 illustrates an example fact table within a dimensional model.
Figure 4-7 A fact table within a dimensional model
When developing a fact table, keep the following points in mind, because they
have considerations for the implementation of the model:
򐂰 For best performance, use an INTEGER data type for the link columns that
are used to join the fact table to the dimensions tables. Use a surrogate key if
the natural key is of a different data type.
򐂰 Link columns are created as NOT NULL to ensure that the connections to the
dimensions tables can be performed during queries.
Fact Table to
Dimensional Table
Key relationship
Fact Table
measures
Dimensional
Table attributes
Dimensional
Table
Fact Table

Get Solving Operational Business Intelligence with InfoSphere Warehouse Advanced Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.