You are previewing Inside Microsoft® SQL Server® 2008: T-SQL Querying.
O'Reilly logo
Inside Microsoft® SQL Server® 2008: T-SQL Querying

Book Description

Tackle the toughest set-based querying and query tuning problems—guided by an author team with in-depth, inside knowledge of T-SQL. Deepen your understanding of architecture and internals—and gain practical approaches and advanced techniques to optimize your code’s performance.

Discover how to:

  • Move from procedural programming to the language of sets and logic

  • Optimize query tuning with a top-down methodology

  • Assess algorithmic complexity to predict performance

  • Compare data-aggregation techniques, including new grouping sets

  • Manage data modification—insert, delete, update, merge—for performance

  • Write more efficient queries against partitioned tables

  • Work with graphs, trees, hierarchies, and recursive queries

  • Plus—Use pure-logic puzzles to sharpen your problem-solving skills

  • Table of Contents

    1. Inside Microsoft® SQL Server® 2008: T-SQL Querying
    2. Foreword
    3. Acknowledgments
    4. Introduction
      1. Hardware and Software Requirements
      2. Companion Content and Sample Database
      3. Find Additional Content Online
      4. Support for These Books
        1. Questions and Comments
    5. 1. Logical Query Processing
      1. Logical Query Processing Phases
        1. Logical Query Processing Phases in Brief
      2. Sample Query Based on Customers/Orders Scenario
      3. Logical Query Processing Phase Details
        1. Step 1: The FROM Phase
          1. Step 1-J1: Perform Cartesian Product (Cross Join)
          2. Step 1-J2: Apply ON Filter (Join Condition)
          3. Step 1-J3: Add Outer Rows
        2. Step 2: The WHERE Phase
        3. Step 3: The GROUP BY Phase
        4. Step 4: The HAVING Phase
        5. Step 5: The SELECT Phase
          1. Step 5-1: Evaluate Expressions
          2. Step 5-2: Apply the DISTINCT Clause
          3. Step 5-3: Apply the TOP Option
        6. Step 6: The Presentation ORDER BY Phase
      4. Further Aspects of Logical Query Processing
        1. Table Operators
          1. APPLY
          2. PIVOT
          3. UNPIVOT
        2. OVER Clause
        3. Set Operators
      5. Conclusion
    6. 2. Set Theory and Predicate Logic
      1. An Example of English-to-Mathematics Translation
        1. Well-Definedness
          1. Definitions
          2. Undefined Terms
        2. Equality, Identity, and Sameness
        3. Mathematical Conventions
        4. Numbers
        5. Context
          1. Dates
          2. Alphabetical Order
        6. Functions, Parameters, and Variables
        7. Instructions and Algorithms
      2. Set Theory
        1. Notation for Sets
          1. Enumeration
          2. Set-Builder Notation
        2. Well-Definedness of Sets
        3. Domains of Discourse
          1. Domains and Bad Data
          2. Domains and Modeling
        4. Faithfulness
          1. No REAL Faithfulness
        5. Russell’s Paradox
        6. Ordered Pairs, Tuples, and Cartesian Products
          1. Ordered Pairs and k-Tuples
          2. The Cartesian Product
        7. The Empty Set(s)
        8. The Characteristic Function of a Set
        9. Cardinality
          1. Formal and Constructive Definitions of Cardinality
        10. Order
          1. Numerical Order
          2. Alphabetical Order
          3. Trichotomy
          4. Induced Order
          5. Ordinal Numbers
        11. Set Operators
          1. Union and Intersection
          2. Set Difference
        12. Set Partitions
        13. Generalizations of Set Theory
          1. Multiset Theory
      3. Predicate Logic
        1. Logic-Like Features of Programming Languages
          1. The Keyword IF in Control-of-Flow Statements
        2. Propositions and Predicates
          1. Boolean Expressions in T-SQL
          2. Proposition or Predicate?
          3. Creating Propositions from Predicates
        3. The Law of Excluded Middle
        4. And, Or, and Not
          1. What Not Is Not
          2. When And Means Or
          3. Exclusive Or
        5. Logical Equivalence
          1. DeMorgan’s Laws
        6. Logical Implication
          1. If P, Then Q
          2. The Contrapositive
          3. Vacuous Truths
        7. Quantification
          1. Negating Quantified Statements
          2. Multiple Quantification
        8. Alternatives and Generalizations
          1. Boolean Algebra
          2. Three-Valued Logic
          3. Fuzzy Logic
      4. Relations
        1. The Reflexive, Symmetric, and Transitive Properties
          1. Not All < Operators Were Created Equal
      5. A Practical Application
      6. Conclusion
    7. 3. The Relational Model
      1. Introduction to the Relational Model
        1. Relations, Tuples and Types
          1. The Meaning of Relations
          2. Views (and Other Virtual Relations)
          3. Naming Conventions
        2. The Relational Model: A Quick Summary
      2. Relational Algebra and Relational Calculus
        1. Basic Operators
        2. Relational Algebra
          1. Codd’s Eight Original Operators
          2. Additional Relational Algebra Operators
          3. Primitive Relational Algebra Operators
        3. Relational Calculus
        4. T-SQL Support
      3. Data Integrity
        1. Declarative Constraints
          1. Entity Integrity
          2. Referential Integrity
          3. Domain Integrity
        2. Other Means of Enforcing Integrity
          1. The Good, the Bad, and the . . . Unknown!
      4. Normalization and Other Design Topics
        1. Normal Forms Dealing with Functional Dependencies
          1. First Normal Form
          2. Second Normal Form
          3. Third Normal Form
          4. Boyce-Codd Normal Form
        2. Higher Normal Forms
          1. Fourth Normal Form
          2. Fifth Normal Form
          3. Additional Normal Forms
        3. Denormalization
        4. Generalization and Specialization
          1. Principle of Orthogonal Design
      5. Conclusion
    8. 4. Query Tuning
      1. Sample Data for This Chapter
      2. Tuning Methodology
        1. Analyze Waits at the Instance Level
          1. Isolating Top Waits
          2. Collecting Wait Information
        2. Correlate Waits with Queues
        3. Determine Course of Action
        4. Drill Down to the Database/File Level
        5. Drill Down to the Process Level
          1. Trace Performance Workload
          2. Analyze Trace Data
          3. Query Statistics
        6. Tune Indexes and Queries
      3. Tools for Query Tuning
        1. Cached Query Execution Plans
        2. Clearing the Cache
        3. Dynamic Management Objects
        4. STATISTICS IO
        5. Measuring the Run Time of Queries
        6. Analyzing Execution Plans
          1. Graphical Execution Plans
          2. Textual Showplans
          3. XML Showplans
        7. Hints
        8. Traces/Profiler
        9. Database Engine Tuning Advisor
        10. Data Collection and Management Data Warehouse
        11. Using SMO to Clone Statistics
      4. Index Tuning
        1. Table and Index Structures
          1. Pages and Extents
          2. Table Organization
          3. Heap
          4. Clustered Index
          5. Nonclustered Index on a Heap
          6. Nonclustered Index on a Clustered Table
        2. Index Access Methods
          1. Table Scan/Unordered Clustered Index Scan
          2. Unordered Covering Nonclustered Index Scan
          3. Ordered Clustered Index Scan
          4. Ordered Covering Nonclustered Index Scan
          5. The Storage Engine’s Treatment of Scans
            1. Allocation Order Scans vs. Index Order Scans
            2. Allocation Order Scans
            3. Index Order Scans
          6. Nonclustered Index Seek + Ordered Partial Scan + Lookups
          7. Unordered Nonclustered Index Scan + Lookups
          8. Clustered Index Seek + Ordered Partial Scan
          9. Covering Nonclustered Index Seek + Ordered Partial Scan
          10. Index Intersection
          11. Filtered Indexes and Statistics
          12. Indexed Views
        3. Analysis of Indexing Strategies
          1. Table Scan (Unordered Clustered Index Scan)
          2. Unordered Covering Nonclustered Index Scan
          3. Unordered Nonclustered Index Scan + Lookups
          4. Nonclustered Index Seek + Ordered Partial Scan + Lookups
          5. Clustered Index Seek + Ordered Partial Scan
          6. Covering Nonclustered Index Seek + Ordered Partial Scan
          7. Summary of Analysis of Indexing Strategy
        4. Fragmentation
        5. Partitioning
      5. Preparing Sample Data
        1. Data Preparation
        2. TABLESAMPLE
      6. An Examination of Set-Based vs. Iterative/Procedural Approaches and a Tuning Exercise
      7. Conclusion
    9. 5. Algorithms and Complexity
      1. Do You Have a Quarter?
        1. How to Retrieve a Quarter from a Coin Jar
        2. Sometimes the Jar Has No Quarters
      2. How Algorithms Scale
        1. An Example of Quadratic Scaling
        2. An Algorithm with Linear Complexity
        3. Exponential and Superexponential Complexity
          1. The Factorial Function
        4. Sublinear Complexity
          1. Binary Search
        5. Constant Complexity
        6. Technical Definitions of Complexity
          1. Big Oh and related notations
          2. Polynomial and Nonpolynomial Complexity
        7. Comparing Complexities
      3. Classic Algorithms and Algorithmic Strategies
        1. Algorithms for Sorting
          1. Quadratic Sorting Algorithms
            1. Insertion sort
            2. Selection sort
          2. O(n log n) Sorting Algorithms
            1. Merge sort
            2. Quick sort
          3. Faster Sorting Algorithms
            1. Ultra sort
        2. String Searching
      4. A Practical Application
        1. Identifying Trends in Measurement Data
          1. Increasing Subsequences
        2. The Algorithmic Complexity of LISLP
          1. How Many Subsequences Are There?
          2. An Algorithm for LISLP with Θ(n log n) Complexity
        3. Solving the Longest Increasing Subsequence Length Problem in T-SQL
      5. Conclusion
    10. 6. Subqueries, Table Expressions, and Ranking Functions
      1. Subqueries
        1. Self-Contained Subqueries
        2. Correlated Subqueries
          1. Isolating One Row Per Group and Applying a Tiebreaker
          2. EXISTS
            1. EXISTS vs. IN
            2. NOT EXISTS vs. NOT IN
            3. Minimum Missing Value
            4. Reverse Logic Applied to Relational Division Problems
        3. Misbehaving Subqueries
        4. Uncommon Predicates
      2. Table Expressions
        1. Derived Tables
          1. Result Column Aliases
          2. Using Arguments
          3. Nesting
          4. Multiple References
        2. Common Table Expressions
          1. Result Column Aliases
          2. Using Arguments
          3. Multiple CTEs
          4. Multiple References
          5. Modifying Data
          6. CTEs in View and Inline Function Definitions
          7. Recursive CTEs
      3. Analytical Ranking Functions
        1. Row Number
          1. The ROW_NUMBER Function
            1. Determinism
            2. Partitioning
          2. Using Subqueries to Calculate Row Numbers
            1. Unique Sort Column
            2. Nonunique Sort Column and Tiebreaker
            3. Nonunique Sort Column Without a Tiebreaker
            4. Partitioning
          3. Cursor-Based Solution
          4. IDENTITY-Based Solution
            1. Nonpartitioned
            2. Partitioned
          5. Performance Comparisons
          6. Paging
            1. Ad Hoc Paging
            2. Multipage Access
        2. Rank and Dense Rank
          1. RANK and DENSE_RANK Functions
          2. Solutions Based on Subqueries
        3. Tile Number
          1. The Built-in NTILE Function
          2. Other Solutions to Tile Number
      4. Auxiliary Table of Numbers
      5. Missing and Existing Ranges (Also Known as Gaps and Islands)
        1. Missing Ranges (Gaps)
          1. Gaps, Solution 1: Using Subqueries
          2. Gaps, Solution 2: Using Subqueries
          3. Gaps, Solution 3: Using Ranking Functions
          4. Gaps, Solution 4: Using Cursors
          5. Returning Individual Missing Values
        2. Existing Ranges (Islands)
          1. Islands, Solution 1: Using Subqueries and Ranking Calculations
          2. Islands, Solution 2: Using Group Identifier Based on Subqueries
          3. Islands, Solution 3: Using Group Identifier Based on Ranking Calculations
          4. Islands, Solution 4: Using Cursors
          5. A Variation of the Islands Problem
      6. Conclusion
    11. 7. Joins and Set Operations
      1. Joins
        1. Old Style vs. New Style
        2. Fundamental Join Types
          1. CROSS
          2. INNER
          3. OUTER
          4. Nonsupported Join Types
        3. Further Examples of Joins
          1. Self Joins
          2. Multiple Joins
            1. Controlling the Physical Join Evaluation Order
            2. Controlling the Logical Join Evaluation Order
            3. Bushy Plans
          3. Semi Joins
        4. Sliding Total of Previous Year
        5. Join Algorithms
          1. Nested Loops
          2. Merge
          3. Hash
            1. Bitmap Filters in Star Schema Joins
          4. Forcing a Join Strategy
        6. Separating Elements
      2. Set Operations
        1. UNION
          1. UNION DISTINCT
          2. UNION ALL
        2. EXCEPT
          1. EXCEPT DISTINCT
          2. EXCEPT ALL
        3. INTERSECT
          1. INTERSECT DISTINCT
          2. INTERSECT ALL
        4. Precedence of Set Operations
        5. Using INTO with Set Operations
        6. Circumventing Unsupported Logical Phases
      3. Conclusion
    12. 8. Aggregating and Pivoting Data
      1. OVER Clause
      2. Tiebreakers
      3. Running Aggregations
        1. Cumulative Aggregations
        2. Sliding Aggregations
        3. Year-to-Date (YTD)
      4. Pivoting
        1. Pivoting Attributes
        2. Relational Division
        3. Aggregating Data
      5. Unpivoting
      6. Custom Aggregations
        1. Custom Aggregations Using Pivoting
          1. String Concatenation Using Pivoting
          2. Aggregate Product Using Pivoting
        2. User Defined Aggregates (UDA)
        3. Specialized Solutions
          1. Specialized Solution for Aggregate String Concatenation
            1. Dynamic Pivoting
          2. Specialized Solution for Aggregate Product
          3. Specialized Solutions for Aggregate Bitwise Operations
            1. Aggregate Bitwise OR
            2. Aggregate Bitwise AND
            3. Aggregate Bitwise XOR
          4. Median
          5. Mode
      7. Histograms
      8. Grouping Factor
      9. Grouping Sets
        1. Sample Data
        2. The GROUPING SETS Subclause
        3. The CUBE Subclause
        4. The ROLLUP Subclause
        5. Grouping Sets Algebra
          1. Multiplication
          2. Division
          3. Addition
        6. The GROUPING_ID Function
        7. Materialize Grouping Sets
        8. Sorting
      10. Conclusion
    13. 9. TOP and APPLY
      1. SELECT TOP
        1. TOP and Determinism
        2. TOP and Input Expressions
        3. TOP and Modifications
        4. TOP on Steroids
      2. APPLY
      3. Solutions to Common Problems Using TOP and APPLY
        1. TOP n for Each Group
        2. Matching Current and Previous Occurrences
        3. Paging
          1. First Page
          2. Next Page
          3. Previous Page
        4. Random Rows
        5. Median
      4. Logical Transformations
      5. Conclusion
    14. 10. Data Modification
      1. Inserting Data
        1. Enhanced VALUES Clause
        2. SELECT INTO
        3. BULK Rowset Provider
        4. Minimally Logged Operations
          1. Analyzing Logging Behavior
          2. Testing Insert Scenarios
            1. Scenario 1: SELECT INTO, FULL Recovery
            2. Scenario 2: SELECT INTO, Non-FULL Recovery
            3. Scenario 3: INSERT SELECT, Empty Heap, TABLOCK
            4. Scenario 4: INSERT SELECT, Nonempty Heap, TABLOCK
            5. Scenario 5: INSERT SELECT, Empty Heap, Without TABLOCK
            6. Scenario 6: INSERT SELECT, Empty B-Tree, TABLOCK
            7. Scenario 7: INSERT SELECT, Nonempty B-Tree, TABLOCK, TF-610 Off, New Key Range
            8. Scenario 8: INSERT SELECT, Nonempty B-Tree, TABLOCK, TF-610 On, New Key Range
            9. Scenario 9: INSERT SELECT, Nonempty B-Tree, TABLOCK, Merged Key Range
            10. Scenario 10: INSERT SELECT, Empty B-Tree, Without TABLOCK, TF-610 Off
            11. Scenario 11: INSERT SELECT, Empty B-Tree, Without TABLOCK, TF-610 On
            12. Scenario 12: INSERT SELECT, Nonempty B-Tree, without TABLOCK, TF-610 Off, New Key Range
            13. Scenario 13: INSERT SELECT, Nonempty B-Tree, without TABLOCK, TF-610 On, New Key Range
            14. Scenario 14: INSERT SELECT, Nonempty B-Tree, without TABLOCK, Merged Key Range
          3. Summary of Minimal Logging
        5. INSERT EXEC
        6. Sequence Mechanisms
          1. Identity Columns
          2. Custom Sequences
            1. Blocking Sequences
            2. Single Sequence Value
            3. Block of Sequence Values
            4. Nonblocking Sequences
        7. GUIDs
      2. Deleting Data
        1. TRUNCATE vs. DELETE
        2. Removing Rows with Duplicate Data
        3. DELETE Using Joins
      3. Updating Data
        1. UPDATE Using Joins
        2. Updating Large Value Types
        3. SELECT and UPDATE Statement Assignments
          1. Assignment SELECT
          2. Assignment UPDATE
      4. Merging Data
        1. MERGE Fundamentals
        2. Adding a Predicate
        3. Multiple WHEN Clauses
        4. WHEN NOT MATCHED BY SOURCE
        5. MERGE Values
        6. MERGE and Triggers
      5. OUTPUT Clause
        1. INSERT with OUTPUT
        2. DELETE with OUTPUT
        3. UPDATE with OUTPUT
        4. MERGE with OUTPUT
        5. Composable DML
      6. Conclusion
    15. 11. Querying Partitioned Tables
      1. Partitioning in SQL Server
        1. Partitioned Views
          1. Comparing Partitioned Views and Partitioned Tables
        2. Partitioned Tables
          1. Query Plans for Partitioned Tables
          2. Statistics on Partitioned Tables
          3. Partition Elimination
          4. Partitioning and Parallelism
      2. Conclusion
    16. 12. Graphs, Trees, Hierarchies, and Recursive Queries
      1. Terminology
        1. Graphs
        2. Trees
        3. Hierarchies
      2. Scenarios
        1. Employee Organizational Chart
        2. Bill of Materials (BOM)
        3. Road System
      3. Iteration/Recursion
        1. Subordinates
        2. Ancestors
        3. Subgraph/Subtree with Path Enumeration
        4. Sorting
        5. Cycles
      4. Materialized Path
        1. Maintaining Data
          1. Adding Employees Who Manage No One (Leaves)
          2. Moving a Subtree
          3. Removing a Subtree
        2. Querying
      5. Materialized Path with the HIERARCHYID Data Type
        1. Maintaining Data
          1. Adding Employees
          2. Moving a Subtree
        2. Querying
          1. Subtree
          2. Path
          3. Direct Subordinates
          4. Leaf Nodes
          5. Presentation
        3. Further Aspects of Working with HIERARCHYID
          1. Normalizing HIERARCHYID Values
          2. Convert Parent-Child Representation to HIERARCHYID
          3. Sorting Separated Lists of Values
      6. Nested Sets
        1. Assigning Left and Right Values
        2. Querying
      7. Transitive Closure
        1. Directed Acyclic Graph
          1. Undirected Cyclic Graph
      8. Conclusion
    17. A. Logic Puzzles
      1. Puzzles
        1. Puzzle 1: Remainders
        2. Puzzle 2: Round Manhole Covers
        3. Puzzle 3: Shaking Hands
        4. Puzzle 4: Then There Were Five?
        5. Puzzle 5: Arranging Soldiers in a Row
        6. Puzzle 6: Crossing the Tunnel
        7. Puzzle 7: Escaping a Cave
        8. Puzzle 8: Free Tuna
        9. Puzzle 9: Naming an Heir
        10. Puzzle 10: The Next Element in a Series
        11. Puzzle 11: Same Birthday
        12. Puzzle 12: Catching a Train
        13. Puzzle 13: Prisoners and Switches
        14. Puzzle 14: Probabilities in China
        15. Puzzle 15: Two Mathematicians
        16. Puzzle 16: Crazy Sequence
        17. Puzzle 17: Minimum Number of Weights
        18. Puzzle 18: Counting Triangles
        19. Puzzle 19: Counterfeit Coins
        20. Puzzle 20: Too Clever by Half
        21. Puzzle 21: A Cat, a String, and the Earth
        22. Puzzle 22: Josephus Problem
        23. Puzzle 23: Shipping Algebra
        24. Puzzle 24: Equilateral Triangles Puzzle
      2. Puzzle Solutions
        1. Puzzle 1: Remainders
        2. Puzzle 2: Round Manhole Covers
        3. Puzzle 3: Shaking Hands
        4. Puzzle 4: Then There Were Five?
        5. Puzzle 5: Arranging Soldiers in a Row
        6. Puzzle 6: Crossing the Tunnel
        7. Puzzle 7: Escaping a Cave
        8. Puzzle 8: Free Tuna
        9. Puzzle 9: Naming an Heir
        10. Puzzle 10: The Next Element in a Series
        11. Puzzle 11: Same Birthday
        12. Puzzle 12: Catching a Train
        13. Puzzle 13: Prisoners and Switches
        14. Puzzle 14: Probabilities in China
        15. Puzzle 15: Two Mathematicians
        16. Puzzle 16: Crazy Sequence
        17. Puzzle 17: Minimum Number of Weights
        18. Puzzle 18: Counting Triangles
        19. Puzzle 19: Counterfeit Coins
        20. Puzzle 20: Too Clever by Half
        21. Puzzle 21: A Cat, a String, and the Earth
        22. Puzzle 22: Josephus Problem
        23. Puzzle 23: Shipping Algebra
        24. Puzzle 24: Equilateral Triangles Puzzle
      3. Conclusion
    18. B. About the Authors
      1. Itzik Ben-Gan
      2. Lubor Kollar
      3. Dejan Sarka
      4. Steve Kass
    19. Index
    20. About the Authors
    21. Copyright