336 Solving Operational Business Intelligence with InfoSphere Warehouse Advanced Edition
9.1 The value of data and its age
The most valuable data in a data warehouse is data that delivers actionable
insight to the business. In an operational business intelligence (BI) environment,
this is recent or low-latency data. The ability to gain actionable insight from
low-latency data increases as the volume of that data increases. Because the
data sample is greater, more opportunities exist to find value through identifying
patterns, seeing trends, predicting outcomes with greater accuracy, comparing
data sets, and applying data mining techniques.
By effectively managing the data lifecycle, you can help enable greater volumes
of data to be ingested at greater speeds, which delivers an increased return on
investment to your business. The challenge is to maximize the ingest rate. To
meet this challenge, you must have the capability to organize, segregate, and
isolate data that is “cooling off” to help ensure that priority operational queries are
as efficient as possible.
As discussed in Chapter 7, “Understand and address data latency requirements”
on page 267, reducing the latency between a business event occurring and
the ability to act on insight derived from data related to that event represents
a significant value proposition to your business.
The technical challenge is to store and maintain data to facilitate queries in a way
that maximizes performance and minimizes cost. This can be achieved by
distinguishing data as it ages.
The concept of data temperature and a multi-temperature database, although
unique to each environment, can be generally described by the patterns shown
in Table 9-1.
Table 9-1 Data temperature patterns

| Data temperature | Data temperature characteristics | Typical data age | Data maintenance |
|---|---|---|---|
| Hot | Tactical and OLTP-type data; that is, current data that is accessed frequently by queries that must have short response times, for example, high volume, small result set point queries in operational data stores (ODS). | 0 - 3 months and aggregates or summaries of this data. | Data is located on the fastest storage and is updated frequently. Frequent table space backups are taken to aid fast recovery if needed. |
Chapter 9. Managing data lifecycle with InfoSphere Warehouse 337
There are always exceptions when determining the temperature of data and
these exceptions are different in each environment:
- Active data
  Certain date-based events such as the new year might result in data for that
  period being accessed more often; this is referred to as active data.
- Priority workloads
  Some queries or applications might have a greater priority than others in your
  environment, even though the data accessed might not be the most current
  data.
The increasing role of corporate governance and regulations can dictate that
data is retained long after its business use has expired, and this need
extends the data lifecycle. Developing a strategy to archive data and move it
offline to alternative storage, from which it can be easily retrieved when
required, is crucial.
Table 9-1 Data temperature patterns (continued)

| Data temperature | Data temperature characteristics | Typical data age | Data maintenance |
|---|---|---|---|
| Warm | Traditional decision support-type data; that is, data that is accessed less frequently and by queries that most likely do not require short response times. | 3 - 13 months and aggregates or summaries of this data. | As data ages it is accessed less frequently and cools; it is moved to less expensive “warm” storage. The warm data is updated less frequently, and so less frequent table space backups are taken. |
| Cold | Deep historical and legacy data; that is, data that is typically accessed infrequently. | 13 months - 5 years. | Data is placed on the least expensive storage available on the production system. Infrequent updates to data are captured by less frequent database incremental and delta backup operations. |
| Dormant | Regulatory type or archival data; that is, data that is accessed infrequently and that is never updated. | Over 5 years. | Data is archived to backup, file, or to a federated database on inexpensive hardware. |
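
The age ranges in Table 9-1 can be read as a simple classification rule. The following Python sketch is illustrative only and is not part of InfoSphere Warehouse: the names `TIERS`, `age_in_months`, and `temperature` are hypothetical, and the handling of boundary ages (exactly 3, 13, or 60 months) is an assumption, because the table lets adjacent ranges touch.

```python
from datetime import date

# Upper age bounds in months, taken from the "Typical data age" column.
# Assumption: a boundary age (exactly 3, 13, or 60 months) falls into the
# cooler tier; the table itself does not say which side a boundary is on.
TIERS = [(3, "hot"), (13, "warm"), (60, "cold")]


def age_in_months(created: date, today: date) -> int:
    """Whole months elapsed between two dates (partial months are dropped)."""
    months = (today.year - created.year) * 12 + (today.month - created.month)
    if today.day < created.day:
        months -= 1
    return max(months, 0)


def temperature(created: date, today: date) -> str:
    """Map a creation date to a temperature tier using age alone."""
    age = age_in_months(created, today)
    for upper_bound, tier in TIERS:
        if age < upper_bound:
            return tier
    return "dormant"
```

Age is only the default signal. As noted earlier in this section, active data and priority workloads are exceptions, so a real policy would likely layer environment-specific overrides on top of a rule like this one.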
