Book description
Hash tables can do a lot more than you might think! Data Management Solutions Using SAS Hash Table Operations: A Business Intelligence Case Study concentrates on solving your challenging data management and analysis problems via the power of the SAS hash object, whose environment and tools make it possible to create complete dynamic solutions. To this end, this book provides an in-depth overview of the hash table as an in-memory database with the CRUD (Create, Retrieve, Update, Delete) cycle rendered by the hash object tools. By using this concept and focusing on real-world problems exemplified by sports data sets and statistics, this book seeks to help you use the hash object productively for tasks including, but not limited to, the following:
- select proper hash tools to perform hash table operations
- use proper hash table operations to support specific data management tasks
- use the dynamic, run-time nature of hash object programming
- understand the algorithmic principles behind hash table data look-up, retrieval, and aggregation
- learn how to perform data aggregation, for which the hash object is exceptionally well suited
- manage the hash table memory footprint, especially when processing big data
- use hash object techniques for other data processing tasks, such as filtering, combining, splitting, sorting, and unduplicating.
Table of contents
- About This Book
- About These Authors
- Acknowledgments
Part One—The HOW of the SAS Hash Object
- Chapter 1: Hash Object Essentials
- Chapter 2: Table-Level Operations
- 2.1 Introduction
- 2.2 CREATE Operation
- 2.2.1 Declaring a Hash Object
- 2.2.2 Creating a Hash Object Instance
- 2.2.3 Combining Declaration and Instantiation
- 2.2.4 Defining Hash Table Variables
- 2.2.5 Omitting the DEFINEDATA Method
- 2.2.6 Wrapping Up the Create Operation
- 2.2.7 PDV Host Variables and Parameter Type Matching
- 2.2.8 Other Ways of Hard-Coded Parameter Type Matching
- 2.2.9 Dynamic Parameter Type Matching via File Reference
- 2.2.10 Parameter Type Matching by Forced File Reference
- 2.2.11 Parameter Type Matching by Default File Reference
- 2.2.12 Defining Multiple Hash Variables
- 2.2.13 Defining Hash Variables as Non-Literal Expressions
- 2.2.14 Defining Hash Variables Dynamically One at a Time
- 2.2.15 Defining Hash Variables Using Metadata
- 2.2.16 Multiple Instances Issue
- 2.2.17 Ensuring Single Instance Usage
- 2.2.18 Handling Multiple Instances
- 2.2.19 Create Operation Hash Tools
- 2.3 DELETE (Table) Operation
- 2.4 CLEAR Operation
- 2.5 OUTPUT Operation
- 2.5.1 The OUTPUT Method
- 2.5.2 Open-Write-Close Cycle
- 2.5.3 Open-Write-Close Cycle Encapsulation
- 2.5.4 Avoiding Open File Conflicts
- 2.5.5 Output Data Set Member Types
- 2.5.6 Creating and Overwriting Output Data Set
- 2.5.7 Using Output Data Set Options
- 2.5.8 DATASET Argument as Non-Literal Expression
- 2.5.9 Output Data Order
- 2.5.10 Output Operation Hash Tools
- 2.6 DESCRIBE Operation
- Chapter 3: Item-Level Operations: Direct Access
- 3.1 Introduction
- 3.2 SEARCH (Pure LookUp) Operation
- 3.3 INSERT Operation
- 3.3.1 Dynamic Memory Acquisition
- 3.3.2 Implicit INSERT
- 3.3.3 Implicit INSERT: Method Call Mode
- 3.3.4 Implicit INSERT: Methods Other Than ADD
- 3.3.5 Implicit INSERT: Argument Tag Mode
- 3.3.6 Explicit INSERT
- 3.3.7 Explicit INSERT Rules
- 3.3.8 Implicit vs Explicit INSERT
- 3.3.9 Unique Key and Duplicate Key INSERT
- 3.3.10 Unique INSERT
- 3.3.11 Duplicate INSERT
- 3.3.12 Insertion Order
- 3.3.13 Insert Operation Hash Tools
- 3.3.14 INSERT Operation Hash-PDV Interaction
- 3.4 DELETE ALL Operation
- 3.5 RETRIEVE Operation
- 3.6 UPDATE ALL Operation
- 3.7 ORDER Operation
- 3.7.1 ORDER Operation Invocation
- 3.7.2 ORDERED Argument Tag Plasticity
- 3.7.3 Hash Items vs Hash Item Groups
- 3.7.4 OUTPUT Operation Effects
- 3.7.5 General Hash Table Order Principle
- 3.7.6 Ordering by Composite Keys
- 3.7.7 Setting the SORTEDBY= Option
- 3.7.8 ORDER Operation Hash Tools
- 3.7.9 ORDER Operation Hash-PDV Interaction
- Chapter 4: Item-Level Operations: Enumeration
- 4.1 Introduction
- 4.2 Enumeration: Basics and Classification
- 4.3 KEYNUMERATE Operation
- 4.3.1 KeyNumerate Operation Mechanics
- 4.3.2 FIND_NEXT: Implicit vs Explicit
- 4.3.3 Other KeyNumerate Coding Styles
- 4.3.4 Version 9.4 Add-On: DO_OVER
- 4.3.5 Forward and Backward, In and Out
- 4.3.6 Staying within the Item List (Keeping It Set)
- 4.3.7 HAS_NEXT and HAS_PREV Peculiarities
- 4.3.8 Harvesting Hash Items
- 4.3.9 Harvesting Hash Items via Explicit Calls
- 4.3.10 Selective DELETE and UPDATE Operations
- 4.3.11 Selective DELETE: Single Item
- 4.3.12 Selective Delete: Multiple Items
- 4.3.13 Selective UPDATE
- 4.3.14 Selective DELETE vs Selective UPDATE
- 4.3.15 KeyNumerate Operation Hash Tools
- 4.3.16 KeyNumerate Operation Hash-PDV Interaction
- 4.4 ENUMERATE ALL Operation
- 4.4.1 The Hash Iterator Object
- 4.4.2 Creating and Linking the Iterator Object
- 4.4.3 Hash Iterator Pointer
- 4.4.4 Direct Iterator Access: First Item
- 4.4.5 Direct Iterator Access: Last Item
- 4.4.6 Direct Iterator Access: Key-Item
- 4.4.7 Sequential Access
- 4.4.8 Enumerating from the End Points
- 4.4.9 Iterator Priming Using NEXT and PREV
- 4.4.10 FIRST/LAST vs NEXT/PREV
- 4.4.11 Keeping the Iterator in the Table
- 4.4.12 Enumerating Sequentially from a Key-Item
- 4.4.13 Harvesting Same-Key Items from a Key-Item
- 4.4.14 The Hash Iterator and Item Locking
- 4.4.15 Locking and Unlocking
- 4.4.16 Locking Same-Key Item Groups
- 4.4.17 Locking the Entire Hash Table
- 4.4.18 ENUMERATE ALL Operation Hash Tools
- 4.4.19 Hash-PDV Interaction
Part Two—The WHAT and the WHY of the SAS Hash Object
- Chapter 5: Bizarro Ball Sample Data
- Chapter 6: Data Tasks Using Hash Table Operations
- Chapter 7: Supporting Data Warehouse Star Schemas
- Chapter 8: Creating Data Aggregates and Metrics
Part Three—Expanding the WHAT and the WHY, along with the HOW of the SAS Hash Object
- Chapter 9: Hash of Hashes – Looping Through SAS Hash Objects
- Chapter 10: The Hash Object as a Dynamic Data Structure
- Chapter 11: Hash Object Memory Management
- 11.1 Introduction
- 11.2 Memory vs. Disk Trade-Off
- 11.3 Making Use of Existing Key Order
- 11.4 MD5 Hash Key Reduction
- 11.5 Data Portion Offload (Hash Index)
- 11.6 Uniform Input Split
- 11.7 Uniform MD5 Split On the Fly
- 11.8 Uniform Split Using a SAS Index
- 11.9 Combining Hash Memory-Saving Techniques
- 11.10 MD5 Argument Concatenation Ins and Outs
- 11.11 Summary
- Part Four—Wrapping up: Two Case Studies
- Index
Product information
- Title: Data Management Solutions Using SAS Hash Table Operations
- Author(s):
- Release date: July 2018
- Publisher(s): SAS Institute
- ISBN: 9781635260595