Carpenter's Guide to Innovative SAS Techniques

Book description

Carpenter's Guide to Innovative SAS Techniques offers advanced SAS programmers an all-in-one programming reference that includes advanced topics not easily found outside the depths of SAS documentation or more advanced training classes. Art Carpenter has written fifteen chapters of advanced tips and techniques, including topics on data summary, data analysis, and data reporting. Special emphasis is placed on DATA step techniques that solve complex data problems. There are numerous examples that illustrate advanced techniques that take advantage of formats, interface with the macro language, and utilize the Output Delivery System. Additional topics include operating system interfaces, table lookup techniques, and the creation of customized reports.

Table of contents

  1. Copyright
  2. Dedication
  3. Contents
  4. About This Book
  5. Acknowledgments
  6. About the Author
  7. Chapter 1 Moving, Copying, Importing, and Exporting Data
    1. 1.1 LIBNAME Statement Engines
      1. 1.1.1 Using Data Access Engines to Read and Write Data
      2. 1.1.2 Using the Engine to View the Data
      3. 1.1.3 Options Associated with the Engine
      4. 1.1.4 Replacing EXCEL Sheets
      5. 1.1.5 Recovering the Names of EXCEL Sheets
    2. 1.2 PROC IMPORT and EXPORT
      1. 1.2.1 Using the Wizard to Build Sample Code
      2. 1.2.2 Control through the Use of Options
      3. 1.2.3 PROC IMPORT Data Source Statements
      4. 1.2.4 Importing and Exporting CSV Files
      5. 1.2.5 Preventing the Export of Blank Sheets
      6. 1.2.6 Working with Named Ranges
    3. 1.3 DATA Step INPUT Statement
      1. 1.3.1 Format Modifiers for Errors
      2. 1.3.2 Format Modifiers for the INPUT Statement
      3. 1.3.3 Controlling Delimited Input
      4. 1.3.4 Reading Variable-Length Records
    4. 1.4 Writing Delimited Files
      1. 1.4.1 Using the DATA Step with the DLM= Option
      2. 1.4.2 PROC EXPORT
      3. 1.4.3 Using the %DS2CSV Macro
      4. 1.4.4 Using ODS and the CSV Destination
      5. 1.4.5 Inserting the Separator Manually
    5. 1.5 SQL Pass-Through
      1. 1.5.1 Adding a Pass-Through to Your SQL Step
      2. 1.5.2 Pass-Through Efficiencies
    6. 1.6 Reading and Writing to XML
      1. 1.6.1 Using ODS
      2. 1.6.2 Using the XML Engine
  8. Chapter 2 Working with Your Data
    1. 2.1 Data Set Options
      1. 2.1.1 REPLACE and REPEMPTY
      2. 2.1.2 Password Protection
      3. 2.1.3 KEEP, DROP, and RENAME Options
      4. 2.1.4 Observation Control Using FIRSTOBS and OBS Data Set Options
    2. 2.2 Evaluating Expressions
      1. 2.2.1 Operator Hierarchy
      2. 2.2.2 Using the Colon as a Comparison Modifier
      3. 2.2.3 Logical and Comparison Operators in Assignment Statements
      4. 2.2.4 Compound Inequalities
      5. 2.2.5 The MIN and MAX Operators
      6. 2.2.6 Numeric Expressions and Boolean Transformations
    3. 2.3 Data Validation and Exception Reporting
      1. 2.3.1 Date Validation
      2. 2.3.2 Writing to an Error Data Set
      3. 2.3.3 Controlling Exception Reporting with Macros
    4. 2.4 Normalizing - Transposing the Data
      1. 2.4.1 Using PROC TRANSPOSE
      2. 2.4.2 Transposing in the DATA Step
    5. 2.5 Filling Sparse Data
      1. 2.5.1 Known Template of Rows
      2. 2.5.2 Double Transpose
      3. 2.5.3 Using COMPLETYPES with PROC MEANS or PROC SUMMARY
      4. 2.5.4 Using CLASSDATA
      5. 2.5.5 Using Preloaded Formats
      6. 2.5.6 Using the SPARSE Option with PROC FREQ
    6. 2.6 Some General Concepts
      1. 2.6.1 Shorthand Variable Naming
      2. 2.6.2 Understanding the ORDER= Option
      3. 2.6.3 Quotes within Quotes within Quotes
      4. 2.6.4 Setting the Length of Numeric Variables
    7. 2.7 WHERE Specifics
      1. 2.7.1 Operators Just for the WHERE
      2. 2.7.2 Interaction with the BY Statement
    8. 2.8 Appending Data Sets
      1. 2.8.1 Appending Data Sets Using the DATA Step and SQL UNION
      2. 2.8.2 Using the DATASETS Procedure’s APPEND Statement
    9. 2.9 Finding and Eliminating Duplicates
      1. 2.9.1 Using PROC SORT
      2. 2.9.2 Using FIRST. and LAST. BY-Group Processing
      3. 2.9.3 Using PROC SQL
      4. 2.9.4 Using PROC FREQ
      5. 2.9.5 Using the Data Component Hash Object
    10. 2.10 Working with Missing Values
      1. 2.10.1 Special Missing Values
      2. 2.10.2 MISSING System Option
      3. 2.10.3 Using the CMISS, NMISS, and MISSING Functions
      4. 2.10.4 Using the CALL MISSING Routine
      5. 2.10.5 When Classification Variables Are Missing
      6. 2.10.6 Missing Values and Macro Variables
      7. 2.10.7 Imputing Missing Values
  9. Chapter 3 Just In the DATA Step
    1. 3.1 Working across Observations
      1. 3.1.1 BY-Group Processing—Using FIRST. and LAST. Processing
      2. 3.1.2 Transposing to ARRAYs
      3. 3.1.3 Using the LAG Function
      4. 3.1.4 Look-Ahead Using a MERGE Statement
      5. 3.1.5 Look-Ahead Using a Double SET Statement
      6. 3.1.6 Look-Back Using a Double SET Statement
      7. 3.1.7 Building a FIFO Stack
      8. 3.1.8 A Bit on the SUM Statement
    2. 3.2 Calculating a Person’s Age
      1. 3.2.1 Simple Formula
      2. 3.2.2 Using Functions
      3. 3.2.3 The Way Society Measures Age
    3. 3.3 Using DATA Step Component Objects
      1. 3.3.1 Declaring (Instantiating) the Object
      2. 3.3.2 Using Methods with an Object
      3. 3.3.3 Simple Sort Using the HASH Object
      4. 3.3.4 Stepping through a Hash Table
      5. 3.3.5 Breaking Up a Data Set into Multiple Data Sets
      6. 3.3.6 Hash Tables That Reference Hash Tables
      7. 3.3.7 Using a Hash Table to Update a Master Data Set
    4. 3.4 Doing More with the INTNX and INTCK Functions
      1. 3.4.1 Interval Multipliers
      2. 3.4.2 Shift Operators
      3. 3.4.3 Alignment Options
      4. 3.4.4 Automatic Dates
    5. 3.5 Variable Conversions
      1. 3.5.1 Using the PUT and INPUT Functions
      2. 3.5.2 Decimal, Hexadecimal, and Binary Number Conversions
    6. 3.6 DATA Step Functions
      1. 3.6.1 The ANY and NOT Families of Functions
      2. 3.6.2 Comparison Functions
      3. 3.6.3 Concatenation Functions
      4. 3.6.4 Finding Maximum and Minimum Values
      5. 3.6.5 Variable Information Functions
      6. 3.6.6 New Alternatives and Functions That Do More
      7. 3.6.7 Functions That Put the Squeeze on Values
    7. 3.7 Joins and Merges
      1. 3.7.1 BY Variable Attribute Consistency
      2. 3.7.2 Variables in Common That Are Not in the BY List
      3. 3.7.3 Repeating BY Variables
      4. 3.7.4 Merging without a Clear Key (Fuzzy Merge)
    8. 3.8 More on the SET Statement
      1. 3.8.1 Using the NOBS= and POINT= Options
      2. 3.8.2 Using the INDSNAME= Option
      3. 3.8.3 A Comment on the END= Option
      4. 3.8.4 DATA Steps with Two SET Statements
    9. 3.9 Doing More with DO Loops
      1. 3.9.1 Using the DOW Loop
      2. 3.9.2 Compound Loop Specifications
      3. 3.9.3 Special Forms of Loop Specifications
    10. 3.10 More on Arrays
      1. 3.10.1 Array Syntax
      2. 3.10.2 Temporary Arrays
      3. 3.10.3 Functions Used with Arrays
      4. 3.10.4 Implicit Arrays
  10. Chapter 4 Sorting the Data
    1. 4.1 PROC SORT Options
      1. 4.1.1 The NODUPREC Option
      2. 4.1.2 The DUPOUT= Option
      3. 4.1.3 The TAGSORT Option
      4. 4.1.4 Using the SORTSEQ Option
      5. 4.1.5 The FORCE Option
      6. 4.1.6 The EQUALS or NOEQUALS Options
    2. 4.2 Using Data Set Options with PROC SORT
    3. 4.3 Taking Advantage of Known or Knowable Sort Order
    4. 4.4 Metadata Sort Information
    5. 4.5 Using Threads
  11. Chapter 5 Working with Data Sets
    1. 5.1 Automating the COMPARE Process
    2. 5.2 Reordering Variables on the PDV
    3. 5.3 Building and Maintaining Indexes
      1. 5.3.1 Introduction to Indexing
      2. 5.3.2 Creating Simple Indexes
      3. 5.3.3 Creating Composite Indexes
      4. 5.3.4 Using the IDXWHERE and IDXNAME Options
      5. 5.3.5 Index Caveats and Considerations
    4. 5.4 Protecting Passwords
      1. 5.4.1 Using PROC PWENCODE
      2. 5.4.2 Protecting Database Passwords
    5. 5.5 Deleting Data Sets
    6. 5.6 Renaming Data Sets
      1. 5.6.1 Using the RENAME Function
      2. 5.6.2 Using PROC DATASETS
  12. Chapter 6 Table Lookup Techniques
    1. 6.1 A Series of IF Statements—The Logical Lookup
    2. 6.2 IF -THEN/ELSE Lookup Statements
    3. 6.3 DATA Step Merges and SQL Joins
    4. 6.4 Merge Using Double SET Statements
    5. 6.5 Using Formats
    6. 6.6 Using Indexes
      1. 6.6.1 Using the BY Statement
      2. 6.6.2 Using the KEY= Option
    7. 6.7 Key Indexing (Direct Addressing)—Using Arrays to Form a Simple Hash
      1. 6.7.1 Building a List of Unique Values
      2. 6.7.2 Performing a Key Index Lookup
      3. 6.7.3 Using a Non-Numeric Index
    8. 6.8 Using the HASH Object
    9. 7.1 Using Multiple CLASS Statements and CLASS Statement Options
      1. 7.1.1 MISSING and DESCENDING Options
      2. 7.1.2 GROUPINTERNAL Option
      3. 7.1.3 Order= Option
    10. 7.2 Letting SAS Name the Output Variables
    11. 7.3 Statistic Specification on the OUTPUT Statement
    12. 7.4 Identifying the Extremes
      1. 7.4.1 Using the MAXID and MINID Options
      2. 7.4.2 Using the IDGROUP Option
      3. 7.4.3 Using Percentiles to Create Subsets
    13. 7.5 Understanding the _TYPE_ Variable
    14. 7.6 Using the CHARTYPE Option
    15. 7.7 Controlling Summary Subsets Using the WAYS Statement
    16. 7.8 Controlling Summary Subsets Using the TYPES 
 Statement
    17. 7.9 Controlling Subsets Using the CLASSDATA= and EXCLUSIVE Options
    18. 7.10 Using the COMPLETETYPES Option
    19. 7.11 Identifying Summary Subsets Using the LEVELS and WAYS Options
    20. 7.12 CLASS Statement vs. BY Statement
  13. Chapter 8 Other Reporting and Analysis Procedures
    1. 8.1 Expanding PROC TABULATE
      1. 8.1.1 What You Need to Know to Get Started
      2. 8.1.2 Calculating Percentages Using PROC TABULATE
      3. 8.1.3 Using the STYLE= Option with PROC TABULATE
      4. 8.1.4 Controlling Table Content with the CLASSDATA Option
      5. 8.1.5 Ordering Classification Level Headings
    2. 8.2 Expanding PROC UNIVARIATE
      1. 8.2.1 Generating Presentation-Quality Plots
      2. 8.2.2 Using the CLASS Statement
      3. 8.2.3 Probability and Quantile Plots
      4. 8.2.4 Using the OUTPUT Statement to Calculate Percentages
    3. 8.3 Doing More with PROC FREQ
      1. 8.3.1 OUTPUT Statement in PROC FREQ
      2. 8.3.2 Using the NLEVELS Option
    4. 8.4 Using PROC REPORT to Better Advantage
      1. 8.4.1 PROC REPORT vs. PROC TABULATE
      2. 8.4.2 Naming Report Items (Variables) in the Compute Block
      3. 8.4.3 Understanding Compute Block Execution
      4. 8.4.4 Using a Dummy Column to Consolidate Compute Blocks
      5. 8.4.5 Consolidating Columns
      6. 8.4.6 Using the STYLE= Option with LINES
      7. 8.4.7 Setting Style Attributes with the CALL DEFINE Routine
      8. 8.4.8 Dates within Dates
      9. 8.4.9 Aligning Decimal Points
      10. 8.4.10 Conditionally Executing the LINE Statement
    5. 8.5 Using PROC PRINT
      1. 8.5.1 Using the ID and BY Statements Together
      2. 8.5.2 Using the STYLE= Option with PROC PRINT
      3. 8.5.3 Using PROC PRINT to Generate a Table of Contents
  14. Chapter 9 SAS/GRAPH Elements You Should Know—Even if You Don’t Use SAS/GRAPH
    1. 9.1 Using Title Options with ODS
    2. 9.2 Setting and Clearing Graphics Options and Settings
    3. 9.3 Using SAS/GRAPH Statements with Procedures That Are Not SAS/GRAPH Procedures
      1. 9.3.1 Changing Plot Symbols with the SYMBOL Statement
      2. 9.3.2 Controlling Axes and Legends
    4. 9.4 Using ANNOTATE to Augment Graphs
  15. Chapter 10 Presentation Graphics—More than Just SAS/GRAPH6
    1. 10.1 Generating Box Plots
      1. 10.1.1 Using PROC BOXPLOT
      2. 10.1.2 Using PROC GPLOT and the SYMBOL Statement
      3. 10.1.3 Using PROC SHEWHART
    2. 10.2 SAS/GRAPH Specialty Techniques and Procedures
      1. 10.2.1 Building Your Own Graphics Font
      2. 10.2.2 Splitting a Text Line Using JUSTIFY=
      3. 10.2.3 Using Windows Fonts
      4. 10.2.4 Using PROC GKPI
    3. 10.3 PROC FREQ Graphics
  16. Chapter 11 Output Delivery System
    1. 11.1 Using the OUTPUT Destination
      1. 11.1.1 Determining Object Names
      2. 11.1.2 Creating a Data Set
      3. 11.1.3 Using the MATCH_ALL Option
      4. 11.1.4 Using the PERSIST= Option
      5. 11.1.5 Using MATCH_ALL= with the PERSIST= Option
    2. 11.2 Writing Reports to Excel
      1. 11.2.1 EXCELXP Tagset Documentation and Options
      2. 11.2.2 Generating Multisheet Workbooks
      3. 11.2.3 Checking Out the Styles
    3. 11.3 Inline Formatting Using Escape Character Sequences
      1. 11.3.1 Page X of Y
      2. 11.3.2 Superscripts, Subscripts, and a Dagger
      3. 11.3.3 Changing Attributes
      4. 11.3.4 Using Sequence Codes to Control Indentations, Spacing, and Line Breaks
      5. 11.3.5 Issuing Raw RTF Specific Commands
    4. 11.4 Creating Hyperlinks
      1. 11.4.1 Using Style Overrides to Create Links
      2. 11.4.2 Using the LINK= TITLE Statement Option
      3. 11.4.3 Linking Graphics Elements
      4. 11.4.4 Creating Internal Links
    5. 11.5 Traffic Lighting
      1. 11.5.1 User-Defined Format
      2. 11.5.2 PROC TABULATE
      3. 11.5.3 PROC REPORT
      4. 11.5.4 Traffic Lighting with PROC PRINT
    6. 11.6 The ODS LAYOUT Statement
    7. 11.7 A Few Other Useful ODS Tidbits
      1. 11.7.1 Using the ASIS Style Attribute
      2. 11.7.2 ODS RESULTS Statement
  17. Chapter 12 Taking Advantage of Formats
    1. 12.1 Using Preloaded Formats to Modify Report Contents
      1. 12.1.1 Using Preloaded Formats with PROC REPORT
      2. 12.1.2 Using Preloaded Formats with PROC TABULATE
      3. 12.1.3 Using Preloaded Formats with the MEANS and SUMMARY Procedures
    2. 12.2 Doing More with Picture Formats
      1. 12.2.1 Date Directives and the DATATYPE Option
      2. 12.2.2 Working with Fractional Values
      3. 12.2.3 Using the MULT and PREFIX Options
      4. 12.2.4 Display Granularity Based on Value Ranges – Limiting Significant Digits
    3. 12.3 Multilabel (MLF) Formats
      1. 12.3.1 A Simple MLF
      2. 12.3.2 Calculating Rolling Averages
    4. 12.4 Controlling Order Using the NOTSORTED Option
    5. 12.5 Extending the Use of Format Translations
      1. 12.5.1 Filtering Missing Values
      2. 12.5.2 Mapping Overlapping Ranges
      3. 12.5.3 Handling Text within Numeric Values
      4. 12.5.4 Using Perl Regular Expressions within Format Definitions
      5. 12.5.5 Passing Values to a Function as a Format Label
    6. 12.6 ANYDATE Informats
      1. 12.6.1 Reading in Mixed Dates
      2. 12.6.2 Converting Mixed DATETIME Values
    7. 12.7 Building Formats from Data Sets
    8. 12.8 Using the PVALUE Format
    9. 12.9 Format Libraries
      1. 12.9.1 Saving Formats Permanently
      2. 12.9.2 Searching for Formats
      3. 12.9.3 Concatenating Format Catalogs and Libraries
  18. Chapter 13 Interfacing with the Macro Language
    1. 13.1 Avoiding Macro Variable Collisions—Make Your Macro Variables %Local
    2. 13.2 Using the SYMPUTX Routine
      1. 13.2.1 Compared to CALL SYMPUT
      2. 13.2.2 Using SYMPUTX to Save Values of Options
      3. 13.2.3 Using SYMPUTX to Build a List of Macro Variables
    3. 13.3 Generalized Programs—Variations on a Theme
      1. 13.3.1 Steps to the Generalization of a Program
      2. 13.3.2 Levels of Generalization and Levels of Macro Language Understanding
    4. 13.4 Utilizing Macro Libraries
      1. 13.4.1 Establishing an Autocall Library
      2. 13.4.2 Tracing Autocall Macro Locations
      3. 13.4.3 Using Stored Compiled Macro Libraries
      4. 13.4.4 Macro Library Search Order
    5. 13.5 Metadata-Driven Programs
      1. 13.5.1 Processing across Data Sets
      2. 13.5.2 Controlling Data Validations
    6. 13.6 Hard Coding—Just Don’t Do It
    7. 13.7 Writing Macro Functions
    8. 13.8 Macro Information Sources
      1. 13.8.1 Using SASHELP and Dictionary tables
      2. 13.8.2 Retrieving System Options and Settings
      3. 13.8.3 Accessing the Metadata of a SAS Data Set
    9. 13.9 Macro Security and Protection
      1. 13.9.1 Hiding Macro Code
      2. 13.9.2 Executing a Specific Macro Version
    10. 13.10 Using the Macro Language IN Operator
      1. 13.10.1 What Can Go Wrong
      2. 13.10.2 Using the MINOPERATOR Option
      3. 13.10.3 Using the MINDELIMITER= Option
      4. 13.10.4 Compilation vs. Execution for these Options
    11. 13.11 Making Use of the MFILE System Option
    12. 13.12 A Bit on Macro Quoting
  19. Chapter 14 Operating System Interface and Environmental Control
    1. 14.1 System Options
      1. 14.1.1 Initialization Options
      2. 14.1.2 Data Processing Options
      3. 14.1.3 Saving SAS System Options
    2. 14.2 Using an AUTOEXEC Program
    3. 14.3 Using the Configuration File
      1. 14.3.1 Changing the SASAUTOS Location
      2. 14.3.2 Controlling DM Initialization
    4. 14.4 In the Display Manager
      1. 14.4.1 Showing Column Names in ViewTable
      2. 14.4.2 Using the DM Statement
      3. 14.4.3 Enhanced Editor Options and Shortcuts
      4. 14.4.4 Macro Abbreviations for the Enhanced Editor
      5. 14.4.5 Adding Tools to the Application Tool Bar
      6. 14.4.6 Adding Tools to Pull-Down and Pop-up Menus
      7. 14.4.7 Adding Tools to the KEYS List
    5. 14.5 Using SAS to Write and Send E-mails
    6. 14.6 Recovering Physical Location Information
      1. 14.6.1 Using the PATHNAME Function
      2. 14.6.2 SASHELP VIEWS and DICTIONARY Tables
      3. 14.6.3 Determining the Executing Program Name and Path
      4. 14.6.4 Retrieving the UNC (Universal Naming Convention) Path
  20. Chapter 15 Miscellaneous Topics
    1. 15.1 A Few Miscellaneous Tips
      1. 15.1.1 Customizing Your NOTEs, WARNINGs, and ERRORs
      2. 15.1.2 Enhancing Titles and Footnotes with the #BYVAL and #BYVAR Options
      3. 15.1.3 Executing OS Commands
    2. 15.2 Creating User-defined Functions Using PROC FCMP
      1. 15.2.1 Building Your Own Functions
      2. 15.2.2 Storing and Accessing Your Functions
      3. 15.2.3 Interaction with the Macro Language
      4. 15.2.4 Viewing Function Definitions
      5. 15.2.5 Removing Functions
    3. 15.3 Reading RTF as Data
      1. 15.3.1 RTF Diagram Completion
      2. 15.3.2 Template Preparation
      3. 15.3.3 RTF as Data
  21. Appendix A Topical Index
  22. Appendix B Usage Index
    1. Global Statements and Options
      1. Statements, Global
    2. Macro Language
      1. GOPTIONS, Graphics
    3. Options, System
      1. Options, Data Set
    4. Procedures: Steps, Statements, and Options
      1. Procedures
    5. DATA Step: Statements and Options
      1. Statements, DATA Step
      2. Format Modifiers
      3. Functions
      4. Hash Object
    6. Output Delivery System, ODS
      1. ODS Destinations and Tagsets
      2. ODS Attributes
      3. ODS Options
      4. ODS Statements
    7. SAS Display Manager
      1. Display Manager Commands
  23. References
    1. User Publications
    2. Generally Good Reading – Lots More to Learn
  24. Index
  25. Accelerate Your SAS Knowledge with SAS Books

Product information

  • Title: Carpenter's Guide to Innovative SAS Techniques
  • Author(s): Art Carpenter
  • Release date: March 2012
  • Publisher(s): SAS Institute
  • ISBN: 9781612902029