Learning SAS by Example

Book Description

If you like learning by example, this straightforward book makes it easy to learn SAS programming. In an instructive and conversational tone, author Ron Cody clearly explains each programming technique and then illustrates it with one or more real-life examples, followed by a detailed description of how the program works. The text is divided into four major sections: Getting Started; DATA Step Processing; Presenting and Summarizing Your Data; and Advanced Topics.

Subjects addressed include: reading data from external sources; learning details of DATA step programming; subsetting and combining SAS data sets; understanding SAS functions and working with arrays; creating reports with PROC REPORT and PROC TABULATE; learning to use the SAS Output Delivery System; getting started with the SAS macro language; and introducing PROC SQL.

You can test your knowledge and hone your skills by solving the problems at the end of each chapter. (Solutions to odd-numbered problems are located at the back of this book. Solutions to all problems are available to instructors; visit the book's companion Web site for details.)

This book is intended for beginners and intermediate users. Readers should know how to enter and submit a SAS program from their operating system. The book includes a free CD-ROM with the example code, data sets, and solutions to odd-numbered problems.

Table of Contents

  1. Copyright
  2. Praise from the Experts
  3. List of Programs
    1. Programs in Chapter 1
    2. Programs in Chapter 2
    3. Programs in Chapter 3
    4. Programs in Chapter 4
    5. Programs in Chapter 5
    6. Programs in Chapter 6
    7. Programs in Chapter 7
    8. Programs in Chapter 8
    9. Programs in Chapter 9
    10. Programs in Chapter 10
    11. Programs in Chapter 11
    12. Programs in Chapter 12
    13. Programs in Chapter 13
    14. Programs in Chapter 14
    15. Programs in Chapter 15
    16. Programs in Chapter 16
    17. Programs in Chapter 17
    18. Programs in Chapter 18
    19. Programs in Chapter 19
    20. Programs in Chapter 20
    21. Programs in Chapter 21
    22. Programs in Chapter 22
    23. Programs in Chapter 23
    24. Programs in Chapter 24
    25. Programs in Chapter 25
    26. Programs in Chapter 26
  4. Preface
  5. Acknowledgments
  6. 1. Getting Started
    1. 1. What Is SAS?
      1. 1.1. Introduction
      2. 1.2. Getting Data into SAS
      3. 1.3. A Sample SAS Program
      4. 1.4. SAS Names
      5. 1.5. SAS Data Sets and SAS Data Types
      6. 1.6. The SAS Display Manager and SAS Enterprise Guide
      7. 1.7. Problems
    2. 2. Writing Your First SAS Program
      1. 2.1. A Simple Program to Read Raw Data and Produce a Report
      2. 2.2. Enhancing the Program
      3. 2.3. More on Comment Statements
      4. 2.4. How SAS Works (a Look Inside the “Black Box”)
      5. 2.5. Problems
  7. 2. DATA Step Processing
    1. 3. Reading Raw Data from External Files
      1. 3.1. Introduction
      2. 3.2. Reading Data Values Separated by Blanks
      3. 3.3. Specifying Missing Values with List Input
      4. 3.4. Reading Data Values Separated by Commas (CSV Files)
      5. 3.5. Using an Alternative Method to Specify an External File
      6. 3.6. Reading Data Values Separated by Delimiters Other Than Blanks or Commas
      7. 3.7. Placing Data Lines Directly in Your Program (the DATALINES Statement)
      8. 3.8. Specifying INFILE Options with the DATALINES Statement
      9. 3.9. Reading Raw Data from Fixed Columns—Method 1: Column Input
      10. 3.10. Reading Raw Data from Fixed Columns—Method 2: Formatted Input
      11. 3.11. Using a FORMAT Statement in a DATA Step versus in a Procedure
      12. 3.12. Using Informats with List Input
      13. 3.13. Supplying an INFORMAT Statement with List Input
      14. 3.14. Using List Input with Embedded Delimiters
      15. 3.15. Problems
    2. 4. Creating Permanent SAS Data Sets
      1. 4.1. Introduction
      2. 4.2. SAS Libraries—The LIBNAME Statement
      3. 4.3. Why Create Permanent SAS Data Sets?
      4. 4.4. Examining the Descriptor Portion of a SAS Data Set Using PROC CONTENTS
      5. 4.5. Listing All the SAS Data Sets in a SAS Library Using PROC CONTENTS
      6. 4.6. Viewing the Descriptor Portion of a SAS Data Set Using the SAS Explorer
      7. 4.7. Viewing the Data Portion of a SAS Data Set Using PROC PRINT
      8. 4.8. Viewing the Data Portion of a SAS Data Set Using the SAS VIEWTABLE Window
      9. 4.9. Using a SAS Data Set as Input to a DATA Step
      10. 4.10. DATA _NULL_: A Data Set That Isn’t
      11. 4.11. Problems
    3. 5. Creating Formats and Labels
      1. 5.1. Adding Labels to Your Variables
      2. 5.2. Using Formats to Enhance Your Output
      3. 5.3. Regrouping Values Using Formats
      4. 5.4. More on Format Ranges
      5. 5.5. Storing Your Formats in a Format Library
      6. 5.6. Permanent Data Set Attributes
      7. 5.7. Accessing a Permanent SAS Data Set with User-Defined Formats
      8. 5.8. Displaying Your Format Definitions
      9. 5.9. Problems
    4. 6. Reading and Writing Data from an Excel Spreadsheet
      1. 6.1. Introduction
      2. 6.2. Using the Import Wizard to Convert a Spreadsheet to a SAS Data Set
      3. 6.3. Creating an Excel Spreadsheet from a SAS Data Set
      4. 6.4. Using an Engine to Read an Excel Spreadsheet
      5. 6.5. Using the SAS Output Delivery System to Convert a SAS Data Set to an Excel Spreadsheet
      6. 6.6. Problems
    5. 7. Performing Conditional Processing
      1. 7.1. Introduction
      2. 7.2. The IF and ELSE IF Statements
      3. 7.3. The Subsetting IF Statement
      4. 7.4. The IN Operator
      5. 7.5. Using a SELECT Statement for Logical Tests
      6. 7.6. Using Boolean Logic (AND, OR, and NOT Operators)
      7. 7.7. A Caution When Using Multiple OR Operators
      8. 7.8. The WHERE Statement
      9. 7.9. Some Useful WHERE Operators
      10. 7.10. Problems
    6. 8. Performing Iterative Processing: Looping
      1. 8.1. Introduction
      2. 8.2. DO Groups
      3. 8.3. The Sum Statement
      4. 8.4. The Iterative DO Loop
      5. 8.5. Other Forms of an Iterative DO Loop
      6. 8.6. DO WHILE and DO UNTIL Statements
      7. 8.7. A Caution When Using DO UNTIL Statements
      8. 8.8. LEAVE and CONTINUE Statements
      9. 8.9. Problems
    7. 9. Working with Dates
      1. 9.1. Introduction
      2. 9.2. How SAS Stores Dates
      3. 9.3. Reading Date Values from Raw Data
      4. 9.4. Computing the Number of Years between Two Dates
      5. 9.5. Demonstrating a Date Constant
      6. 9.6. Computing the Current Date
      7. 9.7. Extracting the Day of the Week, Day of the Month, Month, and Year from a SAS Date
      8. 9.8. Creating a SAS Date from Month, Day, and Year Values
      9. 9.9. Substituting the 15th of the Month when the Day Value Is Missing
      10. 9.10. Using Date Interval Functions
      11. 9.11. Problems
    8. 10. Subsetting and Combining SAS Data Sets
      1. 10.1. Introduction
      2. 10.2. Subsetting a SAS Data Set
      3. 10.3. Creating More Than One Subset Data Set in One DATA Step
      4. 10.4. Adding Observations to a SAS Data Set
      5. 10.5. Interleaving Data Sets
      6. 10.6. Combining Detail and Summary Data
      7. 10.7. Merging Two Data Sets
      8. 10.8. Omitting the BY Statement in a Merge
      9. 10.9. Controlling Observations in a Merged Data Set
      10. 10.10. More Uses for IN= Variables
      11. 10.11. When Does a DATA Step End?
      12. 10.12. Merging Two Data Sets with Different BY Variable Names
      13. 10.13. Merging Two Data Sets with Different BY Variable Data Types
      14. 10.14. One-to-One, One-to-Many, and Many-to-Many Merges
      15. 10.15. Updating a Master File from a Transaction File
      16. 10.16. Problems
    9. 11. Working with Numeric Functions
      1. 11.1. Introduction
      2. 11.2. Functions That Round and Truncate Numeric Values
      3. 11.3. Functions That Work with Missing Values
      4. 11.4. Setting Character and Numeric Values to Missing
      5. 11.5. Descriptive Statistics Functions
      6. 11.6. Computing Sums within an Observation
      7. 11.7. Mathematical Functions
      8. 11.8. Computing Some Useful Constants
      9. 11.9. Generating Random Numbers
      10. 11.10. Special Functions
      11. 11.11. Functions That Return Values from Previous Observations
      12. 11.12. Problems
    10. 12. Working with Character Functions
      1. 12.1. Introduction
      2. 12.2. Determining the Length of a Character Value
      3. 12.3. Changing the Case of Characters
      4. 12.4. Removing Characters from Strings
      5. 12.5. Joining Two or More Strings Together
      6. 12.6. Removing Leading or Trailing Blanks
      7. 12.7. Using the COMPRESS Function to Remove Characters from a String
      8. 12.8. Searching for Characters
      9. 12.9. Searching for Individual Characters
      10. 12.10. Searching for Words in a String
      11. 12.11. Searching for Character Classes
      12. 12.12. Using the NOT Functions for Data Cleaning
      13. 12.13. Describing a Real Blockbuster Data Cleaning Function
      14. 12.14. Extracting Part of a String
      15. 12.15. Dividing Strings into Words
      16. 12.16. Comparing Strings
      17. 12.17. Performing a Fuzzy Match
      18. 12.18. Substituting Characters or Words
      19. 12.19. Problems
    11. 13. Working with Arrays
      1. 13.1. Introduction
      2. 13.2. Setting Values of 999 to a SAS Missing Value for Several Numeric Variables
      3. 13.3. Setting Values of NA and ? to a Missing Character Value
      4. 13.4. Converting All Character Values to Lowercase
      5. 13.5. Using an Array to Create New Variables
      6. 13.6. Changing the Array Bounds
      7. 13.7. Temporary Arrays
      8. 13.8. Loading the Initial Values of a Temporary Array from a Raw Data File
      9. 13.9. Using a Multidimensional Array for Table Lookup
      10. 13.10. Problems
  8. 3. Presenting and Summarizing Your Data
    1. 14. Displaying Your Data
      1. 14.1. Introduction
      2. 14.2. The Basics
      3. 14.3. Changing the Appearance of Your Listing
      4. 14.4. Changing the Appearance of Values
      5. 14.5. Controlling the Observations That Appear in Your Listing
      6. 14.6. Adding Additional Titles and Footnotes to Your Listing
      7. 14.7. Changing the Order of Your Listing
      8. 14.8. Sorting by More Than One Variable
      9. 14.9. Labeling Your Column Headings
      10. 14.10. Adding Subtotals and Totals to Your Listing
      11. 14.11. Making Your Listing Easier to Read
      12. 14.12. Adding the Number of Observations to Your Listing
      13. 14.13. Double-Spacing Your Listing
      14. 14.14. Listing the First n Observations of Your Data Set
      15. 14.15. Problems
    2. 15. Creating Customized Reports
      1. 15.1. Introduction
      2. 15.2. Using PROC REPORT
      3. 15.3. Selecting the Variables to Include in Your Report
      4. 15.4. Comparing Detail and Summary Reports
      5. 15.5. Producing a Summary Report
      6. 15.6. Demonstrating the FLOW Option of PROC REPORT
      7. 15.7. Using Two Grouping Variables
      8. 15.8. Changing the Order of Variables in the COLUMN Statement
      9. 15.9. Changing the Order of Rows in a Report
      10. 15.10. Applying the ORDER Usage to Two Variables
      11. 15.11. Creating a Multi-Column Report
      12. 15.12. Producing Report Breaks
      13. 15.13. Using a Nonprinting Variable to Order a Report
      14. 15.14. Computing a New Variable with PROC REPORT
      15. 15.15. Computing a Character Variable in a COMPUTE Block
      16. 15.16. Creating an ACROSS Variable with PROC REPORT
      17. 15.17. Modifying the Column Label for an ACROSS Variable
      18. 15.18. Using an ACROSS Usage to Display Statistics
      19. 15.19. Problems
    3. 16. Summarizing Your Data
      1. 16.1. Introduction
      2. 16.2. PROC MEANS—Starting from the Beginning
      3. 16.3. Adding a BY Statement to PROC MEANS
      4. 16.4. Using a CLASS Statement with PROC MEANS
      5. 16.5. Applying a Format to a CLASS Variable
      6. 16.6. Deciding between a BY Statement and a CLASS Statement
      7. 16.7. Creating Summary Data Sets Using PROC MEANS
      8. 16.8. Outputting Other Descriptive Statistics with PROC MEANS
      9. 16.9. Asking SAS to Name the Variables in the Output Data Set
      10. 16.10. Outputting a Summary Data Set: Including a BY Statement
      11. 16.11. Outputting a Summary Data Set: Including a CLASS Statement
      12. 16.12. Using Two CLASS Variables with PROC MEANS
      13. 16.13. Selecting Different Statistics for Each Variable
      14. 16.14. Problems
    4. 17. Counting Frequencies
      1. 17.1. Introduction
      2. 17.2. Counting Frequencies
      3. 17.3. Selecting Variables for PROC FREQ
      4. 17.4. Using Formats to Label the Output
      5. 17.5. Using Formats to Group Values
      6. 17.6. Problems Grouping Values with PROC FREQ
      7. 17.7. Displaying Missing Values in the Frequency Table
      8. 17.8. Changing the Order of Values in PROC FREQ
      9. 17.9. Producing Two-Way Tables
      10. 17.10. Requesting Multiple Two-Way Tables
      11. 17.11. Producing Three-Way Tables
      12. 17.12. Problems
    5. 18. Creating Tabular Reports
      1. 18.1. Introduction
      2. 18.2. A Simple PROC TABULATE Table
      3. 18.3. Describing the Three PROC TABULATE Operators
        1. 18.3.1. Concatenation
        2. 18.3.2. Table Dimensions (Page, Row, and Column)
        3. 18.3.3. Nesting
      4. 18.4. Using the Keyword ALL
      5. 18.5. Producing Descriptive Statistics
      6. 18.6. Combining CLASS and Analysis Variables in a Table
      7. 18.7. Customizing Your Table
      8. 18.8. Demonstrating a More Complex Table
      9. 18.9. Computing Row and Column Percentages
      10. 18.10. Displaying Percentages in a Two-Dimensional Table
      11. 18.11. Computing Column Percentages
      12. 18.12. Computing Percentages on Numeric Variables
      13. 18.13. Understanding How Missing Values Affect PROC TABULATE Output
      14. 18.14. Problems
    6. 19. Introducing the Output Delivery System
      1. 19.1. Introduction
      2. 19.2. Sending SAS Output to an HTML File
      3. 19.3. Creating a Table of Contents
      4. 19.4. Selecting a Different HTML Style
      5. 19.5. Choosing Other ODS Destinations
      6. 19.6. Selecting or Excluding Portions of SAS Output
      7. 19.7. Sending Output to a SAS Data Set
      8. 19.8. Problems
    7. 20. Generating High-Quality Graphics
      1. 20.1. Introduction
      2. 20.2. Some Basic Concepts
      3. 20.3. Producing Simple Bar Charts Using PROC GCHART
      4. 20.4. Creating Pie Charts
      5. 20.5. Creating Bar Charts for a Continuous Variable
      6. 20.6. Creating Charts with Values Representing Categories
      7. 20.7. Creating Bar Charts Representing Sums
      8. 20.8. Creating Bar Charts Representing Means
      9. 20.9. Adding Another Variable to the Chart
      10. 20.10. Producing Scatter Plots
      11. 20.11. Connecting Points
      12. 20.12. Connecting Points with a Smooth Line
      13. 20.13. Problems
  9. 4. Advanced Topics
    1. 21. Using Advanced INPUT Techniques
      1. 21.1. Introduction
      2. 21.2. Handling Missing Values at the End of a Line
      3. 21.3. Reading Short Data Lines
      4. 21.4. Reading External Files with Lines Longer Than 256 Characters
      5. 21.5. Detecting the End of the File
      6. 21.6. Reading a Portion of a Raw Data File
      7. 21.7. Reading Data from Multiple Files
      8. 21.8. Reading Data from Multiple Files Using a FILENAME Statement
      9. 21.9. Reading External Filenames from a Data File
      10. 21.10. Reading Multiple Lines of Data to Form One Observation
      11. 21.11. Reading Data Conditionally (the Single Trailing @ Sign)
      12. 21.12. More Examples of the Single Trailing @ Sign
      13. 21.13. Creating Multiple Observations from One Line of Input
      14. 21.14. Using Variable and Informat Lists
      15. 21.15. Using Relative Column Pointers to Read a Complex Data Structure Efficiently
      16. 21.16. Problems
    2. 22. Using Advanced Features of User-Defined Formats and Informats
      1. 22.1. Introduction
      2. 22.2. Using Formats to Recode Variables
      3. 22.3. Using Formats with a PUT Function to Create New Variables
      4. 22.4. Creating User-Defined Informats
      5. 22.5. Reading Character and Numeric Data in One Step
      6. 22.6. Using Formats (and Informats) to Perform Table Lookup
      7. 22.7. Using a SAS Data Set to Create a Format
      8. 22.8. Updating and Maintaining Your Formats
      9. 22.9. Using Formats within Formats
      10. 22.10. Multilabel Formats
      11. 22.11. Using the INPUTN Function to Perform a More Complicated Table Lookup
      12. 22.12. Problems
    3. 23. Restructuring SAS Data Sets
      1. 23.1. Introduction
      2. 23.2. Converting a Data Set with One Observation per Subject to a Data Set with Several Observations per Subject: Using a DATA Step
      3. 23.3. Converting a Data Set with Several Observations per Subject to a Data Set with One Observation per Subject: Using a DATA Step
      4. 23.4. Converting a Data Set with One Observation per Subject to a Data Set with Several Observations per Subject: Using PROC TRANSPOSE
      5. 23.5. Converting a Data Set with Several Observations per Subject to a Data Set with One Observation per Subject: Using PROC TRANSPOSE
      6. 23.6. Problems
    4. 24. Working with Multiple Observations per Subject
      1. 24.1. Introduction
      2. 24.2. Identifying the First or Last Observation in a Group
      3. 24.3. Counting the Number of Visits Using PROC FREQ
      4. 24.4. Counting the Number of Visits Using PROC MEANS
      5. 24.5. Computing Differences between Observations
      6. 24.6. Computing Differences between the First and Last Observation in a BY Group Using the LAG Function
      7. 24.7. Computing Differences between the First and Last Observation in a BY Group Using a RETAIN Statement
      8. 24.8. Using a Retained Variable to “Remember” a Previous Value
      9. 24.9. Problems
    5. 25. Introducing the SAS Macro Language
      1. 25.1. Introduction
      2. 25.2. Macro Variables: What Are They?
      3. 25.3. Some Built-In Macro Variables
      4. 25.4. Assigning Values to Macro Variables with a %LET Statement
      5. 25.5. Demonstrating a Simple Macro
      6. 25.6. A Word about Tokens
      7. 25.7. Another Example of Using a Macro Variable as a Prefix
      8. 25.8. Using a Macro Variable to Transfer a Value between DATA Steps
      9. 25.9. Problems
    6. 26. Introducing the Structured Query Language
      1. 26.1. Introduction
      2. 26.2. Some Basics
      3. 26.3. Joining Two Tables (Merge)
      4. 26.4. Left, Right, and Full Joins
      5. 26.5. Concatenating Data Sets
      6. 26.6. Using Summary Functions
      7. 26.7. Demonstrating an ORDER Clause
      8. 26.8. An Example of Fuzzy Matching
      9. 26.9. Problems
  10. Solutions to Odd-Numbered Problems
    1. Chapter 1 Solutions
    2. Chapter 2 Solutions
    3. Chapter 3 Solutions
    4. Chapter 4 Solutions
    5. Chapter 5 Solutions
    6. Chapter 6 Solutions
    7. Chapter 7 Solutions
    8. Chapter 8 Solutions
    9. Chapter 9 Solutions
    10. Chapter 10 Solutions
    11. Chapter 11 Solutions
    12. Chapter 12 Solutions
    13. Chapter 13 Solutions
    14. Chapter 14 Solutions
    15. Chapter 15 Solutions
    16. Chapter 16 Solutions
    17. Chapter 17 Solutions
    18. Chapter 18 Solutions
    19. Chapter 19 Solutions
    20. Chapter 20 Solutions
    21. Chapter 21 Solutions
    22. Chapter 22 Solutions
    23. Chapter 23 Solutions
    24. Chapter 24 Solutions
    25. Chapter 25 Solutions
    26. Chapter 26 Solutions
  11. Books Available from SAS Press
    1. JMP® Books