You are previewing Microsoft® SQL Server® 2012 Integration Services.
O'Reilly logo
Microsoft® SQL Server® 2012 Integration Services

Book Description

Build and manage data integration solutions with expert guidance from the Microsoft SQL Server Integration Services (SSIS) team. See best practices in action and dive deep into the SSIS engine, SSISDB catalog, and security features. Using the developer enhancements in SQL Server 2012 and the flexible SSIS toolset, you’ll handle complex data integration scenarios more efficiently—and acquire the skills you need to build comprehensive solutions. Discover how to:

  • Use SSIS to extract, transform, and load data from multiple data sources

  • Apply best practices to optimize package and project configuration and deployment

  • Manage security settings in the SSISDB catalog and control package access

  • Work with SSIS data quality features to profile, cleanse, and increase reliability

  • Monitor, troubleshoot, and tune SSIS solutions with advanced features such as detailed views and data taps

  • Load data incrementally to capture an easily consumable stream of insert, update, and delete activity

  • Table of Contents

    1. Microsoft® SQL Server® 2012 Integration Services
    2. Dedication
    3. A Note Regarding Supplemental Files
    4. Foreword
    5. Introduction
      1. Who Should Read This Book
        1. Assumptions
      2. Who Should Not Read This Book
      3. Organization of This Book
        1. Finding Your Best Starting Point in This Book
      4. Conventions and Features in This Book
      5. System Requirements
      6. Code Samples
        1. Installing the Code Samples
        2. Using the Code Samples
      7. Acknowledgments
      8. Errata & Book Support
      9. We Want to Hear from You
      10. Stay in Touch
    6. I. Overview
      1. 1. SSIS Overview
        1. Common Usage Scenarios for SSIS
          1. Consolidation of Data from Heterogeneous Data Sources
            1. Data Type Support
            2. Metadata Exposed by Provider
            3. Support in 32 and 64-Bit Environments
            4. Performance
          2. Movement of Data Between Systems
          3. Loading a Data Warehouse
          4. Cleaning, Formatting, or Standardization of Data
          5. Identification, Capture, and Processing of Data Changes
          6. Coordination of Data Maintenance, Processing, or Analysis
        2. Evolution of SSIS
        3. Setting Up SSIS
          1. SQL Server Features Needed for Data Integration
            1. Integration Services
            2. SQL Server Data Tools
            3. Database Engine Services
            4. Management Tools
          2. SQL Server Editions and Integration Services Features
        4. Summary
      2. 2. Understanding SSIS Concepts
        1. Control Flow
          1. Tasks
          2. Precedence Constraints
          3. Variables and Expressions
          4. Containers
            1. Foreach Loop Container
            2. For Loop Container
            3. Sequence Container
          5. Connection Managers
          6. Packages and Projects
          7. Parameters
          8. Log Providers
          9. Event Handlers
        2. Data Flow
          1. Source Adapters
          2. Destination Adapters
          3. Transforms
        3. SSIS Catalog
          1. Overview
          2. Catalog
          3. Folders
          4. Environments
          5. References
        4. Summary
      3. 3. Upgrading to SSIS 2012
        1. What’s New in SSIS 2012
        2. Upgrade Considerations and Planning
          1. Feature Changes in SSIS
            1. Support for Data Transformation Services
            2. ActiveX Script Task
            3. Data Viewers
          2. Dependencies and Tools
          3. Upgrade Requirements
            1. Software Requirements
            2. Hardware Requirements
            3. Upgrade Duration
          4. Upgrade Scenarios
            1. Version Upgrade Paths
            2. Upgrade in Side-by-Side Installation
            3. Edition Upgrade
          5. Unsupported Upgrade Scenarios
          6. Upgrade Validation
        3. Integration Services Upgrade
          1. Upgrade Advisor
          2. Performing Upgrade
            1. Upgrading Installed Files and Objects
            2. Upgrading SSIS Projects and Packages
          3. Addressing Upgrade Issues and Manual Upgrade Steps
          4. Conversion to Projects after Upgrade
            1. Project Conversion Wizard Launch Options
            2. Command Line, Start Menu, or SQL Server Management Studio
            3. SQL Server Data Tools in the Context of an Opened Project
        4. Summary
    7. II. Development
      1. 4. New SSIS Designer Features
        1. The Integration Services Designer
          1. Visual Studio
          2. Undo and Redo
          3. Getting Started Window
          4. Toolbox
          5. Variables Window
          6. Zoom Control
          7. Autosave and Recovery
          8. Status Icons
          9. Annotations
        2. Configuration and Deployment
          1. Solution Explorer Changes
            1. Project Parameters
            2. Connection Managers
          2. Parameter Tab
          3. Visual Studio Configurations
          4. Project Compilation
          5. Deployment Wizard
          6. Project Conversion Wizard
          7. Import Project Wizard
        3. New Tasks and Data Flow Components
          1. Change Data Capture
            1. CDC Control Task
            2. CDC Source
            3. CDC Splitter
          2. Expression Task
          3. DQS Cleansing Transform
          4. ODBC Source and Destination
        4. Control Flow
          1. Expression Adorners
          2. Connection Managers
          3. Execute SQL Task
        5. Data Flow
          1. Connection Assistants
          2. Improved Column Mapping
          3. Editing Components in an Error State
          4. Grouping
          5. Simplified Data Viewers
          6. Row Count and Pivot Transform User Interfaces
          7. Flat File Source Changes
            1. Ragged-Right Delimited Files
            2. Embedded Qualifiers
        6. Scripting
          1. Visual Studio Tools for Applications
          2. Script Component Debugging
          3. .NET 4 Framework Support
        7. Expressions
          1. Removal of the Character Limit
          2. New Expression Functions
            1. LEFT Function
            2. REPLACENULL Function
            3. TOKEN Function
            4. TOKENCOUNT Function
        8. Summary
      2. 5. Team Development
        1. Improvements in SQL Server 2012
          1. Package Format Changes
          2. Visual Studio Configurations
        2. Using Source Control Management with SSIS
          1. Connecting to Team Foundation Server
          2. Adding an SSIS Project to Team Foundation Server
          3. Change Management
          4. Changes to the SSIS Visual Studio Project File
        3. Best Practices
          1. Using Small, Simple Packages
          2. One Developer Per Package
          3. Consistent Naming Conventions
        4. Summary
      3. 6. Developing an SSIS Solution
        1. SSIS Project Deployment Models
          1. Package Deployment Model
            1. Converting from Package to Project Deployment Model
          2. Project Deployment Model
            1. Converting from Project to Package Deployment Model
        2. Develop an Integration Services Project
          1. Creating an SSIS Project
          2. Designing an Integration Services Data Flow
          3. Using Parameters and the ForEach Container
          4. Using the Execute Package Task
          5. Building and Deploying an Integration Services Project
        3. Summary
      4. 7. Understanding SSIS Connectivity
        1. Previous Connectivity Options in SSIS
          1. Providers for Connectivity Technology
          2. OLE DB, ADO.NET, and ODBC
            1. Task and Transform Supportability
            2. Performance
            3. Database Vendor Support
        2. New Connectivity Options in SSIS 2012
          1. Introducing ODBC
            1. ODBC and Database Vendors
            2. ODBC Drivers
            3. ODBC Tracing
          2. ODBC Components for SSIS
          3. ODBC Source
            1. Basic Setup of ODBC Source
            2. Bulk Fetch Logic
            3. Advanced ODBC Source
          4. ODBC Destination
            1. Basic Setup of ODBC Destination
            2. Bulk Insert Logic
            3. Advanced ODBC Destination
        3. Connectivity Considerations for SSIS
          1. 64-Bit and SSIS
            1. 64-Bit Processors and WOW64
          2. SSIS Tools on 64-Bit Architecture
            1. SQL Server Data Tools
            2. SSIS Catalog and Package Execution Tools
        4. Connectivity to Other Sources and Destinations
          1. Connecting to Microsoft Excel and Access
            1. Jet Providers
            2. ACE Providers
            3. Other Considerations for ACE and Jet
          2. Connecting to Oracle
            1. Microsoft OLE DB and .NET Providers for Oracle
            2. Oracle OLE DB Provider
            3. ODBC Connections
            4. Attunity Connectors for Oracle
          3. Creating Custom Components
          4. Using Script Components
        5. Summary
      5. 8. Working with Change Data Capture in SSIS 2012
        1. CDC in SQL Server
          1. Using CDC in SQL Server
          2. CDC Scenarios in ETLs
          3. Stages in CDC
            1. Initial Data Loading
            2. Syncing CDC Changes
        2. CDC in SSIS 2012
          1. CDC State
            1. CDC State String
            2. Types of CDC States
            3. Storing a CDC State
          2. CDC Control Task
            1. Configuring a CDC Control Task
            2. CDC Control Task Operations
          3. Data Flow Component: CDC Source
            1. Configuring a CDC Source Component
            2. Net vs. All Changes
          4. CDC Splitter Component
            1. Types of Changes
            2. Using the Splitter Component
        3. CDC for Oracle
          1. Introduction
          2. Components for Creating CDC for Oracle
          3. CDC Service Configuration MMC
            1. Installation
            2. Using the CDC Service Configuration MMC
            3. What Happens When a Service Is Created?
          4. Oracle CDC Designer MMC
            1. Installation
            2. Oracle CDC Instance
            3. Using the CDC Designer
              1. Create Oracle CDC Instance Wizard
          5. MSXDBCDC Database
            1. Tables in MSXDBCDC
            2. Stored Procedures in MSXDBCDC
          6. Oracle CDC Service Executable (xdbcdcsvc.exe)
            1. Installation
            2. Commands
              1. Config command
              2. Create command
              3. Delete command
          7. Data Type Handling
          8. SSIS CDC Components
        4. Summary
      6. 9. Data Cleansing Using SSIS
        1. Data Profiling Task
        2. Fuzzy Lookup Transformation
        3. Fuzzy Grouping Transformation
        4. Data Quality Services Cleansing Transform
        5. Summary
    8. III. Database Admin
      1. 10. Configuration in SSIS
        1. Configuration Basics
          1. How Configurations Are Applied
            1. Property Path
          2. What to Configure
          3. Changes in SSIS 2012
        2. Configuration in SSIS 2012
          1. Parameters
            1. Project Parameters
            2. Package Parameters
          2. Creating Package Parameters
          3. Creating Project Parameters
            1. More About Project.params
          4. API for Creating Parameters
          5. Using Parameters
            1. Parameters in Expressions and Property Paths
            2. Parameters and Execute Package Tasks
            3. Use of Project Parameters
            4. Differences between Parameters and Variables
        3. Configuring Parameters on the SSIS Catalog
          1. Configuring, Validating, and Executing Packages and Projects
          2. Configuration Through SSMS
            1. Configuring and Validating Packages
            2. Executing with Configured Values
          3. Configuration Using SQL Agent, DTExec, and T-SQL
            1. SQL Agent and Configuration
            2. DTExec and Configuration
            3. Configuring Using T-SQL Stored Procedures
          4. SSIS Environments
            1. Relative and Absolute References
          5. Evaluation Order of Parameters
        4. Package Deployment Model and Backward Compatibility
          1. Package Deployment Model
            1. Limitations of the Package Deployment Model
            2. Executing Packages Using the Package Deployment Model
        5. Best Practices for Configuring SSIS
          1. Best Practices with Package Deployment Model
            1. Package Templates
            2. SQL Server Configurations
            3. Indirect Configurations Using Environment Variables
            4. Overriding Configurations
          2. Best Practices with Project Deployment Model
            1. Project Parameters and Project Connection Managers
        6. Summary
      2. 11. Running SSIS Packages
        1. Ways to Run SSIS Packages
          1. Package Locations
            1. File System
            2. MSDB Database
            3. Package Store
            4. SSIS Catalog
          2. Configuring Packages
            1. SET
            2. ConfigFile
            3. Parameter
          3. Error Dumps
          4. Logging Options
            1. ConsoleLog
            2. Logger
            3. VLog
        2. Running Packages in the SSIS Catalog
          1. Prepare Executions
          2. Starting SSIS Package Executions
          3. View Executions
          4. Executions with T-SQL
        3. Running Packages from SQL Agent
          1. Create an SSIS Job Step
          2. Execute Packages from the SSIS Catalog
        4. Running Packages via PowerShell
        5. Creating and Running SSIS Packages Programmatically
        6. Summary
      3. 12. SSIS T-SQL Magic
        1. Overview of SSIS Stored Procedures and Views
        2. Integration Services Catalog
          1. SSIS Catalog Properties
          2. Querying the SSIS Catalog Properties
          3. Setting SSIS Catalog Properties
        3. SSIS Projects and Packages
          1. Deploy an SSIS Project to the SSIS Catalog
          2. Learning About the SSIS Projects Deployed to the SSIS Catalog
          3. Configuring SSIS Projects
            1. Learning About the Parameters for a Project
            2. Setting the Value of a Parameter
          4. Managing SSIS Projects in the SSIS Catalog
            1. Deleting SSIS Projects
            2. Moving SSIS Projects
            3. Restoring an SSIS Project to an Earlier Version
          5. Running SSIS Packages in the SSIS Catalog
            1. Starting and Stopping SSIS Package Execution
            2. Property Overrides
            3. Discovering Package Execution Details
            4. Setting the Logging Level for a Package Execution
            5. Asynchronous vs. Synchronous Executions
        4. SSIS Environments
          1. Creating SSIS Environments
          2. Creating SSIS Environment Variables
          3. Configuring SSIS Projects Using SSIS Environments
            1. Creating Project References to SSIS Environments
          4. Configuring SSIS Projects Using Reference Values
          5. Package Execution Using SSIS Environments
          6. Managing SSIS Environment and Environment Variables
        5. Summary
      4. 13. SSIS PowerShell Magic
        1. PowerShell Refresher
        2. PowerShell and SQL Server
        3. Managing SSIS with PowerShell
          1. SSIS Management Object Model
          2. PowerShell with SSIS Management Object Model
            1. Creating or Dropping an SSIS Catalog
            2. Creating and Dropping Folders
            3. Deploying and Managing Projects
            4. Create Environments and Environment Variables
            5. Creating Package Executions with Parameters and References
          3. PowerShell and SSIS Using T-SQL
        4. Advantages of Using PowerShell with SSIS
        5. Summary
      5. 14. SSIS Reports
        1. Getting Started with SSIS Reports
        2. Data Preparation
        3. Monitoring SSIS Package Execution
          1. Integration Services Dashboard
          2. All Executions Report
          3. All Validations and All Operations Reports
        4. Using SSIS Reports to Troubleshoot SSIS Package Execution
        5. Using the Execution Performance Report to Identify Performance Trends
        6. Summary
    9. IV. Deep-Dive
      1. 15. SSIS Engine Deep Dive
        1. The Control Flow Engine
          1. Overview
          2. Load
            1. Read XML
            2. Decrypt and Check Signature
            3. Check Package Format Version
            4. Load Objects
            5. Apply Configurations and Expressions
          3. Apply Parameters
          4. Validate
            1. Delaying Validation
            2. Connection Manager Validation
          5. Execute
            1. Scheduling Algorithm
            2. Executable Types
            3. Threading
            4. Events
        2. The Data Flow Engine
          1. Overview
            1. Layout Management
            2. Buffer Manager
          2. Execution Control
            1. Component Interfaces
            2. Output Types
            3. Execution Paths
            4. Execution Plans
            5. Execution
          3. Backpressure
            1. Basic Backpressure
            2. Enhanced Backpressure
          4. Engine Tuning
            1. Buffer Size
            2. EngineThreads
            3. RunInOptimizedMode
            4. Split Complex Components
        3. Summary
      2. 16. SSIS Catalog Deep Dive
        1. SSIS Catalog Deep Dive
          1. Creating the SSIS Catalog
          2. Unit of Deployment to the SSIS Catalog
          3. What Is Inside SSISDB?
          4. SQL Server Instance Starts Up
          5. SSIS Catalog and Logging Levels
        2. Understanding the SSIS Package Execution Life Cycle
          1. Stopping SSIS Package Executions
          2. Using the Windows Application Event Log
        3. SSIS Catalog Maintenance and SQL Server Agent Jobs
        4. Backup and Restore of the SSIS Catalog
          1. Back Up SSISDB
          2. Restore SSISDB
        5. Summary
      3. 17. SSIS Security
        1. Protect Your Package
          1. Control Package Access
            1. File System
            2. MSDB Database
            3. Package Store
          2. Package Encryption
          3. Sensitive Variables and Parameters
          4. Package Signing
        2. Security in the SSIS Catalog
          1. Security Overview
            1. User and Roles
            2. Schemas
            3. Securables and Permissions
          2. Manage Permissions
            1. Permission Initialization
            2. Permission Enforcement
            3. Grant and Revoke Permissions
            4. Permission Inheritance
            5. Deny Permissions
          3. DDL Trigger
        3. Running SSIS with SQL Agent
          1. Requirements
          2. Create Credentials
          3. Create Proxy Accounts
          4. Create SQL Agent Jobs
        4. Summary
      4. 18. Understanding SSIS Logging
        1. Configure Logging Options
          1. Choose Containers
          2. Select Events
          3. Add Log Providers
        2. Log Providers
          1. Text Files
          2. SQL Server
          3. SQL Server Profiler
          4. Windows Event Log
          5. XML Files
        3. Logging in the SSIS Catalog
          1. Logging Levels
          2. Event Logs
          3. Event Context Information
        4. Advanced Logging Topics
          1. Customizing Logging Fields
          2. Logging with dtexec Utility
          3. Developing Custom Log Providers
            1. Building Core Functionality
            2. Registering the New Provider
        5. Summary
      5. 19. Automating SSIS
        1. Introduction to SSIS Automation
          1. Programmatic Generation of SSIS Packages
            1. Creating Packages in Bulk
            2. Applying Consistent Patterns
          2. Metadata-Driven Package Execution
        2. Dynamic Package Generation
          1. Handling Design-Time Events
          2. Samples
            1. Creating, Saving, and Loading a Package
            2. Working with Projects
            3. Creating a Data Flow
            4. Inserting a Task from Another Package
            5. Using an Execute Package Task
        3. Metadata-Based Execution
          1. Custom Package Runner
          2. Using PowerShell with the SSIS Management Object Model
          3. Using PowerShell with SQL Agent
        4. Alternative Solutions and Samples
          1. Samples on Codeplex
            1. EzAPI
            2. Package Generation Sample
            3. MDDE Studio
            4. Vulcan
          2. Third-Party Solutions
            1. BIDS Helper
            2. Varigence Mist
            3. Pragmatic Works BI xPress
            4. CozyRoc
        5. Summary
    10. V. Troubleshooting
      1. 20. Troubleshooting SSIS Package Failures
        1. Getting Started with Troubleshooting
        2. Data Preparation
        3. Troubleshooting Failures of SSIS Package Executions
          1. Three Key Steps Toward Troubleshooting Failures of SSIS Package Executions
            1. Step S: Finding Out the Status of Package Execution
            2. Step V: Finding Out the Values Specified for the Parameters
            3. Step T: Finding Out the Tasks and Containers That Failed
          2. Execution Path
          3. Finding the Root Cause of Failure
            1. Context of an Error
        4. Troubleshooting the Execute Package Task and Child Package Executions
          1. DiagnosticEx Events
          2. Execute Package Task and Execution Path
        5. Troubleshooting SSIS Package Execution Failures Scheduled with SQL Agent
          1. Using CallerInfo to Determine SSIS Package Executions That Are Executed by SQL Agent
          2. Using SQL Agent History Tables to Determine the SSIS Job Steps That Failed
        6. Summary
      2. 21. SSIS Performance Best Practices
        1. Creating a Performance Strategy
          1. OVAL Technique
        2. Measuring SSIS Performance
          1. Measuring System Performance
            1. Buffers Spooled
            2. Other Useful Counters
            3. SSIS Catalog Counters
          2. Measuring Performance of Data Flow Tasks
            1. Instrumenting the Package
            2. Measuring Performance of the Source Component
            3. Measuring Performance of the Transformation Component
            4. Measuring Performance of Destination Components
            5. Analysis of a Data Flow Task
        3. Designing for Performance
          1. Parallelize Your Design
            1. Parallelize Solution Architecture
            2. Parallelizing Using Precedence Constraints
          2. Using SQL Server Optimization Techniques
            1. Staging Data
            2. Replacing OLE DB Commands with Staging on SQL Server
            3. Replacing Suboptimal Aggregation SSIS Operations
            4. Using SQL Server Partitioning
          3. Bulk Loading Your Data
            1. Bulk Loading Using Partitions
          4. Keeping SSIS Operations in Memory
          5. Optimizing SSIS Lookup Caching
            1. Lookup and Cache Transforms Primer
            2. Cache Mode Comparisons
            3. Storing Cache Misses with Partial Cache Mode
            4. Sharing a Cache
          6. Optimizing SSIS Infrastructure
            1. SSIS Catalog
            2. Choosing the Right Connectors
        4. Summary
      3. 22. Troubleshooting SSIS Performance Issues
        1. Performance Profiling
        2. Troubleshooting Performance Issues
          1. Data Preparation
        3. Understanding SSIS Package Execution Performance
          1. SSIS Package Execution Duration
          2. Time Spent at Each Task in the SSIS Package
          3. Time Spent at Each Phase of the Data Flow Component
          4. Elapsed Time for Data Flow Component Phases (Active Time vs. Total Time)
        4. Monitoring SSIS Package Execution Performance
        5. Per-Execution Performance Counters
        6. Interactive Analysis of Performance Data
        7. Summary
      4. 23. Troubleshooting Data Issues
        1. Troubleshooting in the Design Environment
          1. Row Count Values
          2. Data Viewers
          3. Data in Error Output
          4. Breakpoints and Debug Windows
        2. Troubleshooting in the Execution Environment
          1. Execution Data Statistics
          2. Data Tap
            1. Add Data Taps by Name
            2. Add Data Taps by GUID
          3. Error Dumps
        3. Summary
    11. Index
    12. About the Authors
    13. Copyright