You are previewing The Brilliance of Netezza.
O'Reilly logo
The Brilliance of Netezza

Book Description

The Brilliance of Netezza will make all readers instant experts on the architecture of Netezza. With its clever design and “secret sauce” of FPGA Cards and Zone Maps, Netezza is quickly becoming a staple of all major enterprise data warehouses. This is a perfect opportunity to make yourself an invaluable asset to your company by obtaining the kind of knowledge that this book contains.

Table of Contents

  1. Cover
  2. Title Page
  3. The Tera-Tom Genius Series
  4. Tera-Tom- Author of over 50 Books
  5. The Best Query Tool Works on all Systems
  6. Trademarks and Copyrights
  7. About Tom Coffing
  8. About John Nolan
  9. Table of Contents
  10. Chapter 1 – How Netezza Works
    1. What is Parallel Processing?
    2. The Basics of a Single Computer
    3. Netezza Parallel Processes Data
    4. Netezza is Born to be Parallel
    5. Starts with a Linux User, a Database User and A Database
    6. Each SPU holds a Portion of Every Table
    7. The Rows of a Table are Spread Across All SPUs
    8. The Brilliance of Netezza
    9. Compress Engine II – Adaptive Stream Compression
    10. FPGA Card and Zone Maps – The Netezza Secret Weapon
    11. How Data Might Look Like on a SPU
    12. Question – How Many Blocks Move Into Memory?
    13. Answer – How Many Blocks Move Into Memory?
    14. Quiz – Master that Query With the Zone Map
    15. Answer to Quiz – Master that Query With the Zone Map
    16. Netezza has Linear Scalability
    17. The Netezza Architecture
    18. The Host
    19. Netezza Disks
    20. There Are Three Options for Backup and Restore
    21. The IBM Netezza Family
  11. Chapter 2 – A Chip Off The Old Block
    1. Netezza Stores Data in Tables
    2. Each SPU is Assigned Specific Rows
    3. Each SPU Organizes the Rows inside a Data Block (Extent)
    4. SPUs Must Transfer Their Data Blocks to Memory
    5. As Tables Get Bigger the SPU uses Multiple Extents
    6. SPUs Process A Table One Block at a Time
    7. The Slowest Processing is a Full Table Scan
    8. The FPGA Card and the Zone Maps Eliminate Extents
    9. The FPGA Card and the Zone Map Enlightenment
    10. Netezza Systems Can Grow Forever
  12. Chapter 3 – How Netezza Distributes the Data
    1. Netezza Tables - Distribution Key or Random Distribution
    2. Table CREATE Examples with 4 different Distribution Keys
    3. Netezza Uses a Hash Formula
    4. The Hash Map Determines which SPU will own the Row
    5. The Hash Formula, Hash Map and SPU
    6. Placing rows on the SPU
    7. Placing rows on the SPU Continued
    8. A Review of the Hashing Process
    9. Like Data Hashes to the Same SPU
    10. Distribution Keys
    11. Distribution Key in WHERE Clause – 1 SPU Retrieve
    12. A Non-Unique Distribution Key
    13. Distribution Key in the WHERE Clause – 1 SPU Retrieve
    14. A conceptual example of a Multi-Column Distribution Key
    15. Distribution Key in the WHERE Clause – 1 SPU Retrieve
    16. A conceptual example of a Table with Random Distribution
    17. A Full Table Scan – or a Sequential Scan
    18. What happens when you forget the Distribution Key?
    19. Fully Qualifying an Object
    20. Checking for Skew
    21. Educate the Business on the Business by Sharing the Model
    22. Load Your Models and have the SQL Built Automatically
    23. Netezza is Massively Parallel
  13. Chapter 4 – Deep Dive Inside a Netezza Extent and Row
    1. Netezza Performance – Three Things to Keep in Mind
    2. How Netezza Allocates Data Storage
    3. Extents and Zone Maps
    4. How Data Might Look in an Extent
    5. Why Dates Are Generally Not Good Distribution Keys
    6. When a Table is Created, a Table Header is Created
    7. Every SPU has the Exact Same Tables
    8. All Netezza Tables are spread across All SPUs
    9. The Table Header and the Data Rows are Stored Separately
    10. A SPU Stores Rows of a Table inside a Data Block (Extent)
    11. To Read Rows, a SPU Moves the Data Block into Memory
    12. A Full Table Scan Means All SPUs must Read All Rows
    13. The “Achilles Heel”, or Slowest Process, is Block Transfer
    14. Each Table has a Distribution Key
    15. A Query Using the Distribution Key uses a Single SPU
    16. As Rows are Added, Another Extent is Added
    17. A Full Table Scan Means All SPUs Read All Blocks
    18. Distribution Key Query uses One SPU
    19. Using a CTAS to Improve Zone Map Selectivity
    20. How A CTAS with ORDER BY Improves Queries
    21. Each SPU Can Have Many Blocks for a Single Table
    22. A Full Table Scan Means All SPUs Read All Blocks
    23. Quiz – How Many Blocks Move into SPU Memory?
    24. Answer – How Many Blocks Move into SPU Memory?
    25. Netezza Rowid, CreateXid, DeleteXid
    26. An Update of Multiple Rows
    27. How to Undo an Update of Multiple Rows
    28. An Update Undo in Action
    29. A Delete Example
    30. A Delete Example Query
    31. How to Undo a Delete
    32. How to Undo a Delete In Action
    33. An Insert Example
    34. An Insert Example Query
    35. How to Undo an Insert
    36. How to Undo an Insert in Action
    37. What is the Purpose of the GROOM Command?
    38. The Groom Command Refreshes Zone Maps
    39. Groom Command Syntax
    40. Checking Groom Progress
    41. Automatic Database Statistics
    42. Drop Table, Truncate and Delete Compared
  14. Chapter 5 – How Joins Work Internally
    1. Netezza Join Quiz
    2. Netezza Join Quiz Answer
    3. Redistribution
    4. Duplication of the Smaller Table across All-SPUs
    5. If the Join Condition is the Distribution Key no Movement
    6. Matching Rows landed on SPU because of Distribution Keys
    7. What if the Join Condition Columns are Not Distribution Keys
    8. Quiz – Redistribute the Employees by their Dept_No
    9. Quiz – Employees’ Dept_No landed on SPU with Matches
    10. When Rows are on the same SPU they can be Joined
    11. Quiz – Redistribute the Orders to the Proper SPU
    12. Answer to Redistribute the Employees by their Dept_No Quiz
    13. A Visual of the Join in Action
    14. The Joining of Two Tables
    15. Netezza Moves Joining Rows to the Same SPU
    16. Imagine Joining Two Random Distribution Tables
    17. Both Tables are Redistributed to Join Rows on the Same SPU
    18. How do you join if One Table is Big and One Table is Small?
    19. Duplicate the Small Table on Every SPU (like a mirror)
    20. What Could You Do If Two Tables Joined 1000 Times a Day?
    21. Joining Two Tables with the same PK/FK Primary Index
    22. A Join With No Redistribution or Duplication
  15. Chapter 6 – CTAS and CBT
    1. CTAS (Create Table AS)
    2. Using the CTAS (Create Table AS) Table For Co-Location
    3. CTAS Facts
    4. Altering a CTAS Table to Rename It
    5. FPGA Card and Zone Maps – The Netezza Secret Weapon
    6. How A CTAS with ORDER BY Improves Queries
    7. A CTAS Major Sort Benefits over the Minor Sort
    8. A CBT (Cluster Based Table) Orders Data Without Precedence
    9. A CBT (Cluster Based Table) in Theory
    10. Creating a Cluster Based Table (CBT
    11. Creating a Temp Table as a Cluster Based Table (CBT)
    12. Comparing Extents That Are Sorted Vs. A CBT
    13. Benefits Of A Cluster Based Table (CBT)
    14. Altering a Cluster Based Table (CBT) back to a Normal Table
    15. GROOM Command is Used to Physically Change the Table
    16. After Creating a CBT, You Must GROOM The Table
    17. What The GROOM Does for a Table
    18. Groom Command Syntax
    19. Checking Groom Progress
    20. How to Know if your CBT Table Needs to be GROOMED?
  16. Chapter 7 - Temporary Tables
    1. There are Three Types of Temporary Tables
    2. The Same Derived Query shown Three Different Ways
    3. Most Derived Tables Are Used To Join To Other Tables
    4. Our Join Example With A Different Column Aliasing Style
    5. Our Join Example With The WITH Syntax
    6. Syntax For Creating A Temporary Table
    7. Creating and Populating a Temporary Table
    8. A Temporary Table in Action
    9. A Temporary Table Can Be Used Again and Again
    10. Alternative CREATE TEMPORARY TABLE Option
    11. A CTAS Temp Table to Improve Zone Map Selectivity
    12. Creating a Temp Table as a Cluster Based Table (CBT)
    13. What Are External Tables?
    14. External Tables Data Loading Formats
    15. External Table Log Files
    16. External Table Syntax
    17. Exporting Data Off of Netezza into an External Table
    18. Importing Data Into Netezza Using an External Table
    19. What is the Problem Here?
  17. Chapter 8 - Materialized Views
    1. A Materialized View
    2. Good Information to know about Materialized Views
    3. Syntax/Example to Create a Materialized View
    4. Replacing a Materialized View
    5. Zone Maps for Materialized Views
    6. Materialized View Restrictions
    7. Maintaining Materialized Views
    8. Materialized View Best Practices
  18. Chapter 9 – Collecting Statistics
    1. The Basics on Collecting Statistics
    2. Best Practices for Generating Statistics
    3. Syntax to Collect Statistics
    4. Syntax to Collect Express Statistics
    5. The Basics on Collecting Statistics
    6. Collecting Full Statistics
    7. Just-In-Time (JIT) Statistics
    8. How Netezza Collects Statistics on Small Tables
    9. How Netezza Collects Statistics on Medium Tables
    10. How Netezza Collects Statistics on Large Tables
    11. Generating Statistics using NzAdmin
    12. You Cannot Generate Statistics Within a Begin-End Block
  19. Chapter 10 – Using nzsql
    1. Using nzsql
    2. The nzsql Command Prompt
    3. Exporting Variable and then Using nzsql
    4. Connecting to another database through nzsql
    5. Displaying SQL User Session Variables
    6. Inserts, Updates and Deletes Show the Number of Rows
    7. Running a SQL Query from the nzsql Command Line
    8. Nzsql Options That Might Come In Handy
    9. Nzsql Internal Slash Options
    10. Nzsql For External Tables
    11. Why Would Anyone Use nzsql When They Can Use Nexus?
    12. The Nexus Super Join Builder Shows Tables Visually
    13. The Nexus Super Join Builder Builds the SQL Automatically
  20. Chapter 11 – Creating Tables
    1. CREATE TABLE Syntax
    2. Viewing the DDL
    3. Netezza Tables - Distribution Key or Random Distribution
    4. Table CREATE Examples with 4 different Distribution Keys
    5. The Worst Mistake You Can Make For A Distribution Key
    6. Good things to know about Table and Object Names
    7. Netezza Data Types
    8. Netezza Data Types in More Detail
    9. Netezza Data Type Extensions
    10. Reserved Names Within A Table
    11. How To Query and See Non-Active Rows
    12. Column Attributes
    13. Constraints
    14. Constraints
    15. Column Level Constraint Example
    16. Defining Constraints at the Table Level
    17. Utilizing Default Values for a Table
    18. CTAS (Create Table AS)
    19. CTAS Facts
    20. Using the CTAS (Create Table AS) Table For Co-Location
    21. Altering a CTAS Table to Rename It
    22. FPGA Card and Zone Maps – The Netezza Secret Weapon
    23. How A CTAS with ORDER BY Improves Queries
    24. A CTAS Major Sort Benefits over the Minor Sort
    25. Altering A Table
    26. Altering a Table Examples
    27. Drop Table, Truncate, and Delete Compared
  21. Chapter 12 – Creating Databases and Users and Managing Them
    1. Creating and Dropping a Netezza Database
    2. How to Determine the Database you are in?
    3. Netezza Users
    4. Altering a Netezza User
    5. Reserved Words to find out about a User
    6. Using Limit to bring back a Sample
    7. The Super-User is Named Admin
    8. Starts with a Linux User, a Database User and A Database
    9. Creating and Managing A Database
    10. Creating a User
    11. CREATE USER Syntax
    12. Forcing a Password Change for a User
    13. Reserved Words to find out about a User
    14. Altering a Netezza User
    15. How to Determine the Database you are in?
    16. Fully Qualifying a Database, Schema, and Table
    17. Options for Handling Invalid Schema Names
    18. An Example of Setting enable_schema_dbo_check
    19. Creating and Managing a Group
    20. Two Types of Permissions – Object and Admin
    21. Netezza SQL Maximums
    22. Admin Permissions
    23. Object Permissions
    24. Granting Object Permissions
    25. Granting Admin Permissions
    26. Table Permission Examples
    27. Querying Cross-Database
    28. Creating and Managing Synonyms
    29. Creating a Table With Comments
    30. Inserting Rows In A Table
  22. Chapter 13 – Systems Views
    1. _v_aggregate
    2. _v_database
    3. _v_datatype
    4. _v_function
    5. _v_group
    6. _v_groupusers
    7. _v_operator
    8. _v_procedure
    9. _v_relation_column_def
    10. _v_relation_keydata
    11. _v_sequence
    12. _v_session
    13. _v_table
    14. _v_table_dist_map
    15. _v_user
    16. _v_usergroups
    17. _v_view
    18. (System Administrators Only) _v_sys_index
    19. (System Administrators Only) _v_sys_priv
    20. (System Administrators Only) _v_sys_table
    21. (System Administrators Only) _v_sys_user_priv
    22. (System Administrators Only) _v_sys_view
  23. Chapter 14 – Explains
    1. EXPLAIN
    2. EXPLAIN Terms
    3. EXPLAIN Terms Continued
    4. EXPLAIN Syntax
    5. EXPLAIN Example
    6. EXPLAIN Verbose Example
    7. EXPLAIN Example For A Join
    8. EXPLAIN Verbose Example For A Join
    9. Good Advice - Join Tables by the Same Distribution Key
    10. EXPLAIN Verbose - Join With Matching Distribution Keys
    11. EXPLAIN DISTRIBUTION
    12. EXPLAIN PLANTEXT Example
    13. EXPLAIN PLANGRAPH Example