82 Solving Operational Business Intelligence with InfoSphere Warehouse Advanced Edition
4.1 Start with the business problem
In a true business intelligence environment, the objective of a data warehouse is
not to provide the means of holding large amounts of data, or even to provide an
efficient platform for processing queries. These are, in reality, the requirements
needed by a data warehouse to achieve its main goal of providing the means to
answer questions such as the following about your business:
򐂰 Did our business grow from last quarter or last year?
򐂰 Will we have sufficient stock in key regions?
򐂰 Which stores are making a profit or have a loss?
It is these kinds of questions that are important to the business. When you start
to plan your data warehouse solution, they become the first step in your
modeling process.
4.1.1 Online analytical processing
Normally, any company will contain at least one, if not many, operational
production databases that are used in the running of the core business.
Traditionally, these types of relational databases are implemented as part of a
transaction processing application. For this reason, they are known as online
transaction processing (OLTP) databases.
Such databases are aimed at high volume, high velocity, and low complexity
transaction workloads, including those generated during a sales transaction
either at a shop till or an online shopping site. The details of what items
customers purchased, whether they used cash or credit cards, along with what
time the transactions took place are all stored within the OLTP database.
In these cases, the database design will be focused on a single or low number of
table queries and updates. A database design best suited to this type of
environment is based on an entity relationship schema, and is typically modeled
to a level of normalization called third normal form (3NF). This type of model can
be implemented within IBM database development tool called InfoSphere Data
Architect, as one of the following data models:
򐂰 A logical data model, which is a model that represents the entity or table
structures needed to perform a business operation
򐂰 A physical data model, which is a simulation of the tables and other database
objects that constitute a real database
Chapter 4. Data modeling: End to end 83
Figure 4-1 illustrates both a logical model and physical model within InfoSphere
Data Architect.
Figure 4-1 Logical and physical models in InfoSphere Data Architect
By its nature, a database implemented for online transaction processing is
inefficient when used in performing the types of analytical queries required in
answering the range of questions demanded by the company’s business
analysts. The reason is because these analysts, in performing their work of
producing business intelligence reports, must create highly complex,
multidimensional queries that often must engage with large quantities of
historical data.
Physical Model
in
Editor
Logical Model
in Editor
Physical Model
in Project
Explorer
Logical Model
in Project
Explorer
Levels of normalization: For those not familiar with normal form (NF), the full
list is:
򐂰 1NF - No repeating elements or groups of elements within a table.
򐂰 2NF - No partial dependencies on a concatenated key within a table.
򐂰 3NF - No dependencies on non-key attributes within a table
84 Solving Operational Business Intelligence with InfoSphere Warehouse Advanced Edition
However, a database designed to handle OLTP transactions is developed to
process short, simple queries. Thus, it will be inefficient when handling these
complex queries, often resulting in many calls to the database to return the
required information. Normally termed online analytical processing (OLAP),
these types of complex queries therefore require a completely different schema
structure to that implemented for OLTP.
A solution to this problem can be found in a format called a
star schema. A star
schema contains a structure where the data is divided into both Fact and
Dimensional tables as shown in Figure 4-2.
This star schema allows complex queries to be processed with a far smaller
number of interactions with the database. For an efficient OLAP solution,
develop each star schema to represent a single business process that is at the
center of your related business questions.
Figure 4-2 Tables organized as a star schema

Get Solving Operational Business Intelligence with InfoSphere Warehouse Advanced Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.