Mastering the SAS DS2 Procedure

Book Description

Enhance your SAS® data wrangling skills with high-precision, parallel data manipulation using the new DS2 programming language from SAS. DS2 combines the precise procedural power and control of the Base SAS DATA step language with the simplicity and flexibility of SQL. It provides simple, safe syntax for performing complex data transformations in parallel and enables manipulation of native database data types at full precision. It also introduces PROC FEDSQL, a modernized SQL language that blends perfectly with DS2. You will learn to harness the power of parallel processing to speed up CPU-intensive computing processes in Base SAS, and to achieve even more speed by processing DS2 programs on massively parallel database systems. The book presents techniques for leveraging Internet APIs to acquire data, avoiding large data movements when working with data from disparate sources, and using DS2's new data types for full-precision numeric calculations, with examples of why these techniques are essential for the modern data wrangler. While working through the code samples provided with this book, you will build a library of custom, reusable, and easily shareable DS2 program modules, execute parallelized DATA step programs to speed up a CPU-intensive process, and conduct advanced data transformations using hash objects and matrix math operations.

Table of Contents

  1. Foreword
  2. About This Book
  3. About the Author
  4. Chapter 1: Getting Started
  5. 1.1 Introduction
    1. 1.1.1 What to Expect from This Book
    2. 1.1.2 Prerequisite Knowledge
  6. 1.2 Accessing SAS and Setting Up for Practice
  7. 1.2.1 Getting Ready to Practice
  8. Chapter 2: Introduction to the DS2 Language
  9. 2.1 Introduction
  10. 2.2 DS2 Programming Basics
    1. 2.2.1 General Considerations
    2. 2.2.2 Program Structure
    3. 2.2.3 Program Blocks
    4. 2.2.4 Methods
    5. 2.2.5 System Methods
    6. 2.2.6 User-Defined Methods
    7. 2.2.7 Variable Identifiers and Scope
    8. 2.2.8 Data Program Execution
  11. 2.3 Converting a SAS DATA Step to a DS2 Data Program
    1. 2.3.1 A Traditional SAS DATA Step
    2. 2.3.2 Considerations
    3. 2.3.3 The Equivalent DS2 Data Program
  12. 2.4 Review of Key Concepts
  13. Chapter 3: DS2 Data Program Details
  14. 3.1 Introduction
  15. 3.2 DS2 Data Programs versus Base SAS DATA Steps
    1. 3.2.1 General Considerations
    2. 3.2.2 The “Six Subtle Dissimilarities”
    3. 3.2.3 DS2 “Missing” Features
  16. 3.3 Data Types in DS2
    1. 3.3.1 DS2 and ANSI Data Types
    2. 3.3.2 Automatic Data Type Conversion
    3. 3.3.3 Non-coercible Data Types
    4. 3.3.4 Processing SAS Missing and ANSI Null Values
  17. 3.4 Review of Key Concepts
  18. Chapter 4: User-Defined Methods and Packages
  19. 4.1 Introduction
  20. 4.2 Diving into User-Defined Methods
    1. 4.2.1 Overview
    2. 4.2.2 Designing a User-Defined Method
  21. 4.3 User-Defined Packages
    1. 4.3.1 General Considerations
    2. 4.3.2 User-Defined Package Specifics
    3. 4.3.3 Object-Oriented Programming with DS2 Packages
  22. 4.4 Review of Key Concepts
  23. Chapter 5: Predefined Packages
  24. 5.1 Introduction
  25. 5.2 Executing FCMP Functions in DS2
    1. 5.2.1 The FCMP Package
    2. 5.2.2 FCMP Package Example
  26. 5.3 The Hash and Hiter (Hash Iterator) Packages
    1. 5.3.1 General
    2. 5.3.2 Hash Package Example
    3. 5.3.3 Hash Iterator Package Example
  27. 5.4 The HTTP and JSON Packages
    1. 5.4.1 General
    2. 5.4.2 HTTP Package Specifics
    3. 5.4.3 JSON Package Specifics
    4. 5.4.4 HTTP and JSON Packages Example
  28. 5.5 The Matrix Package
    1. 5.5.1 General
    2. 5.5.2 Matrix Package Example
  29. 5.6 The SQLSTMT Package
    1. 5.6.1 General
    2. 5.6.2 SQLSTMT Package Example
  30. 5.7 The TZ (Time Zone) Package
    1. 5.7.1 General
    2. 5.7.2 TZ Package Example
  31. 5.8 Review of Key Concepts
  32. Chapter 6: Parallel Processing in DS2
  33. 6.1 Introduction
  34. 6.2 Understanding Threaded Processing
    1. 6.2.1 The Need for Speed
    2. 6.2.2 Loading Data to and from RAM
    3. 6.2.3 Manipulating Data in RAM
  35. 6.3 DS2 Thread Programs
    1. 6.3.1 Writing DS2 Thread Programs
    2. 6.3.2 Parallel Processing Data with DS2 Threads
  36. 6.4 DS2 and the SAS In-Database Code Accelerator
    1. 6.4.1 Running DS2 Programs In-Database
  37. 6.5 Review of Key Concepts
  38. Chapter 7: Performance Tuning in DS2
  39. 7.1 Introduction
  40. 7.2 DS2_OPTIONS Statement
    1. 7.2.1 TRACE Option
  41. 7.3 Analyzing Performance with the SAS Log
    1. 7.3.1 Obtaining Performance Statistics
    2. 7.3.2 Analyzing Performance Statistics
    3. 7.3.3 Tuning Your Code
  42. 7.4 Learning and Trouble-Shooting Resources
    1. 7.4.1 SAS Learning Resources
    2. 7.4.2 SAS Support Communities
    3. 7.4.3 SAS Technical Support
  43. 7.5 Review of Key Concepts
  44. 7.6 Connecting with the Author
  45. Index