Solving Operational Business Intelligence with InfoSphere Warehouse Advanced Edition

348 Solving Operational Business Intelligence with InfoSphere Warehouse Advanced Edition

Here, we build on the concepts introduced earlier and describe how Optim High

Performance Unload can be used to archive data and migrate data from one

database to another.

When data is no longer needed in the data warehouse, or where the cost of

storage and maintenance has exceeded the value to be gained from the data, it

is advisable to archive the data. In archiving data, it is important that you have a

clear and simple method for reinstating the data if required.

Maintaining database or table space backup images, tapes and history can be

prone to error. IBM Optim High Performance Unload can be used to quickly and

effectively unload the data to be archived to a file or files. When completed, the

files can be retained and the database object removed, which frees space for

warmer data.

9.4.1 Archive cold data

Data archiving is discussed in the context of archiving data out of the data

warehouse. The target destination can be a file, backup image, or a federated

server if data access to the archived data is required.

The goals when archiving data are to have a minimal effect on the production

server, and to be able to realize the storage capacity made available as a result

of the archiving process.

By using storage groups in conjunction with table partitioning and table space

usage as described earlier, a data partition can be detached without affecting the

data availability of the remaining data partitions in the table. After it is detached,

the data partition can be dropped along with the associated table space. Data in

table spaces can then be asynchronously migrated from other storage groups.

Data can be archived in many ways. The traditional method is to perform and

keep a full database or table space backup. An alternate method is to use Optim

High Performance Unload to unload data for the individual data partition and

store the data in flat files for future reference or for migration to another

database.

The following example shows how to use the Optim High Performance Unload

utility to unload data from a table space container for a single data partition.

Chapter 9. Managing data lifecyle with InfoSphere Warehouse 349

The following steps were completed in our test environment to archive and drop

a data partition:

1. Detach the data partition from the partitioned table.

Detaching the data partition from the range partitioned table reverts the data

partition to a separate regular table with the new table name you specify.

Example 9-4 shows the statement to detach data partition PART_2009Q1 into

table ARCHIVE_PART_2009Q1. The new table remains in the same table

space.

Example 9-4 Detach a data partition

-- Detach a data partition from a partitioned table

ALTER TABLE CSTINSIGHT.CUSTOMER_TXN DETACH PARTITION part_2009q1 INTO TABLE

archive_part_2009q1

2. Create the Optim High Performance Unload control file.

Create the control file required to unload data from the table space container

to a single file on disk.

Example 9-5 Sample control file used to unload data

-- archive_part_2009q4.ctl

GLOBAL CONNECT TO cstinsdb

DB2 NO LOCK NO QUIESCE NO;

UNLOAD TABLESPACE

SELECT * FROM cstinsight.archive_part_2009q4

;

output("/stage/filename.del")

format del;

Where:

– The directory /stage is accessible on all data nodes.

– QUIESCE NO is sufficient here because the DETACH process is a system

catalog operation and not a physical data move.

– /stage/filename.del represents a single file for data from all database

partitions.

3. Execute Optim High Performance Unload from the command line as shown in

Example 9-6.

Example 9-6 Issuing db2hpu from the command line with control file

db2hpu -i db2inst1 -f archive_part_2009q4.ctl

Get Solving Operational Business Intelligence with InfoSphere Warehouse Advanced Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.

Start your free trial

Solving Operational Business Intelligence with InfoSphere Warehouse Advanced Edition by Whei-Jen Chen, Pat Bates, Timothy Donovan, Garrett Fitzsimons, Jon Lind, Rogerio Silva

Don’t leave empty-handed

It’s yours, free.

Check it out now on O’Reilly