7.5. Raw Files

Raw files are proprietary sources and destinations in the SSIS data flow, and they are accessible only through SSIS. They are binary files whose header holds all of the file's metadata (data types and column names). Because the format is native to SSIS, it is the fastest source and destination available, which makes it an excellent choice when you must stage data. Raw files also help with reliability, since you can use them in a complex data flow to capture an image of your data at any point. For example, in a four-hour data flow process, you may want to stage the load at several points for recovery reasons, separating the extract and transformation from the load. This is common in a dimension load, where you may have to massage the data before applying the slowly changing dimension logic.

Another use for raw files is to break up a mainframe extract into more practical files. Oftentimes, you will receive files from a vendor or mainframe group as one unified file, when it should have been separated into individual files. For example, imagine the extract file shown in the following table.

RecordType  OrderID  Item      Quantity  Price
1           1
1           2
2           1        Soap      1         2.24
2           1        Firewood  2         4.5
1           3
2           2        Pepper    1         1.58
2           3        Soap      1         2.24
2           3        Cola      5         4.5

In this file, the RecordType 1 rows are the order entries and the RecordType 2 rows hold the order details. This file would generally be a fixed-width ...
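To make the record-type split concrete, here is a minimal sketch in plain Python rather than SSIS. It assumes the extract has already been parsed into tuples matching the sample table; the names `EXTRACT` and `split_by_record_type` are hypothetical and exist only for illustration. It shows the idea of routing rows into separate sets by RecordType, not how an SSIS package implements it.

```python
from collections import defaultdict

# Rows mirror the sample extract: (RecordType, OrderID, Item, Quantity, Price).
# Order-entry rows (RecordType 1) carry no item, quantity, or price.
# Assumption: the file has already been parsed into this shape.
EXTRACT = [
    (1, 1, None, None, None),
    (1, 2, None, None, None),
    (2, 1, "Soap", 1, 2.24),
    (2, 1, "Firewood", 2, 4.5),
    (1, 3, None, None, None),
    (2, 2, "Pepper", 1, 1.58),
    (2, 3, "Soap", 1, 2.24),
    (2, 3, "Cola", 5, 4.5),
]


def split_by_record_type(rows):
    """Group rows by their RecordType, preserving the original row order."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[0]].append(row)
    return dict(groups)


groups = split_by_record_type(EXTRACT)

# Order entries only need the OrderID; details drop the RecordType column.
orders = [row[1] for row in groups[1]]
details = [row[1:] for row in groups[2]]

print("orders:", orders)
print("detail rows:", len(details))
```

Each resulting group could then be written to its own staging file, so that downstream steps read only the record type they need.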
