Elementary Statistics Using SAS

Book Description

Bridging the gap between statistics texts and SAS documentation, Elementary Statistics Using SAS is written for those who want to perform analyses to solve problems. The first section of the book explains the basics of SAS data sets and shows how to use SAS for descriptive statistics and graphs. The second section discusses fundamental statistical concepts, including normality and hypothesis testing. The remaining sections of the book show analyses for comparing two groups, comparing multiple groups, fitting regression equations, and exploring contingency tables. For each analysis, author Sandra Schlotzhauer explains the assumptions, the statistical approach, and the SAS methods and syntax, and draws conclusions from the results. Statistical methods covered include two-sample t-tests, paired-difference t-tests, analysis of variance, multiple comparison techniques, regression, regression diagnostics, and chi-square tests. Elementary Statistics Using SAS is a thoroughly revised and updated edition of Ramon Littell and Sandra Schlotzhauer's SAS System for Elementary Statistical Analysis.

Table of Contents

  1. Copyright
  2. Acknowledgments
  3. Part 1: The Basics
    1. Chapter 1: Getting Started
      1. Introducing This Book
        1. Purpose
        2. Audience
        3. What This Book Is and Isn't
        4. How This Book Is Organized
          1. Using Part 1
          2. Using Part 2
          3. Using Part 3
          4. Using Part 4
          5. Using Part 5
          6. References
        5. Typographic Conventions
        6. How to Use This Book
      2. Introducing SAS Software
        1. Working with Your Computer
        2. Understanding the Ways of Using SAS
        3. Identifying Your SAS Release
        4. Identifying Your On-Site SAS Support Personnel
      3. Summarizing the Structure of SAS Software
        1. Syntax and Spacing Conventions
          1. Syntax
          2. Spacing
        2. Output Produced by SAS Software
      4. Starting SAS
      5. Displaying a Simple Example
      6. Getting Help
      7. Exiting SAS
      8. Introducing Several SAS Statements
        1. The TITLE Statement
        2. The FOOTNOTE Statement
        3. The RUN Statement
        4. The OPTIONS Statement
      9. Summary
        1. Key Ideas
        2. Syntax
        3. Example
    2. Chapter 2: Creating SAS Data Sets
      1. What Is a SAS Data Set?
      2. Understanding the SAS DATA Step
        1. Summarizing the SAS DATA Step
        2. Assigning Names
        3. Task 1: The DATA Statement
        4. Task 2: The INPUT Statement
          1. Identifying Missing Values
          2. Omitting the Column Location for Variables
          3. Putting Several Short Observations on One Line
        5. Task 3: The DATALINES Statement
          1. Existing Data Lines: Using the INFILE Statement
        6. Task 4: The Data Lines
        7. Task 5: The Null Statement
        8. Task 6: The RUN Statement
      3. Creating the Speeding Ticket Data Set
      4. Printing a Data Set
        1. Printing Only Some of the Variables
        2. Suppressing the Observation Number
        3. Adding Blank Lines between Observations
        4. Summarizing PROC PRINT
      5. Sorting a Data Set
      6. Summary
        1. Key Ideas
        2. Syntax
        3. Example
      7. Special Topics
        1. Labeling Variables
        2. Formatting Values of Variables
          1. Using an Existing SAS Format
          2. Using a SAS Function
          3. Creating Your Own Formats
        3. Combining Labeling and Formatting
        4. Syntax
    3. Chapter 3: Importing Data
      1. Opening an Existing SAS Data Set
      2. Reading Data from an Existing Text File
      3. Importing Microsoft Excel Spreadsheets
      4. Using the Import Wizard on a PC
      5. Introducing Advanced Features
      6. Summary
        1. Key Ideas
        2. Syntax
    4. Chapter 4: Summarizing Data
      1. Understanding Types of Variables
        1. Levels of Measurement
        2. Types of Response Scales
        3. Relating Levels of Measurement and Response Scales
      2. Summarizing a Continuous Variable
        1. Reviewing the Moments Table
        2. Reviewing the Basic Statistical Measures Table
        3. Reviewing the Quantiles Table
        4. Reviewing the Extreme Observations Table
        5. Reviewing the Missing Values Table
        6. Reviewing Other Tables
        7. Adding a Summary of Distinct Extreme Values
        8. Using PROC MEANS for a Brief Summary
      3. Creating Line Printer Plots for Continuous Variables
        1. Understanding the Stem-and-Leaf Plot
        2. Understanding the Box Plot
      4. Creating Histograms for Continuous Variables
      5. Creating Frequency Tables for All Variables
        1. Using PROC UNIVARIATE
        2. Using PROC FREQ
        3. Missing Values in PROC FREQ
      6. Creating Bar Charts for All Variables
        1. Specifying Graphics Options
        2. Simple Vertical Bar Charts with PROC GCHART
        3. Simple Vertical Bar Charts with PROC CHART
        4. Discrete Vertical Bar Charts with PROC GCHART
        5. Simple Horizontal Bar Charts with PROC GCHART
        6. Omitting Statistics in Horizontal Bar Charts
        7. Creating Ordered Bar Charts
      7. Checking Data for Errors
        1. Checking for Errors in Continuous Variables
        2. Checking for Errors in Nominal or Ordinal Variables
      8. Summary
        1. Key Ideas
        2. Syntax
        3. Example
      9. Special Topic: Using ODS to Control Output Tables
        1. Finding ODS Table Names
        2. Introducing Other Features
  4. Part 2: Statistical Background
    1. Chapter 5: Understanding Fundamental Statistical Concepts
      1. Populations and Samples
        1. Definitions
        2. Random Samples
        3. Describing Parameters and Statistics
      2. The Normal Distribution
        1. Definition and Properties
        2. The Empirical Rule
      3. Parametric and Nonparametric Statistical Methods
      4. Testing for Normality
        1. Statistical Test for Normality
        2. Other Methods of Checking for Normality
          1. Skewness and Kurtosis
          2. Stem-and-Leaf Plot
          3. Box Plot
          4. Histogram
          5. Normal Probability Plot
          6. Summarizing Conclusions
        3. Identifying ODS Tables
        4. Rechecking the Falcons Data
          1. Understanding the WHERE Statement
      5. Building a Hypothesis Test
      6. Statistical and Practical Significance
        1. Statistical Significance
          1. Choosing a Significance Level
          2. More on p-values
          3. Another Type of Error
        2. Practical Significance
        3. Example
      7. Summary
        1. Key Ideas
        2. Syntax
        3. Example
    2. Chapter 6: Estimating the Mean
      1. Using One Number to Estimate the Mean
      2. Effect of Sample Size
        1. Reducing the Sample Size
      3. Effect of Population Variability
        1. Estimation with a Smaller Population Standard Deviation
      4. The Distribution of Sample Averages
        1. The Central Limit Theorem
          1. How Big Is Large?
          2. The Standard Error of the Mean
        2. The Empirical Rule and the Central Limit Theorem
      5. Getting Confidence Intervals for the Mean
        1. Changing the Confidence Level
        2. Identifying ODS Tables
      6. Summary
        1. Key Ideas
        2. Syntax
        3. Example
  5. Part 3: Comparing Groups
    1. Chapter 7: Comparing Paired Groups
      1. Deciding between Independent and Paired Groups
        1. Independent Groups
        2. Paired Groups
      2. Summarizing Data from Paired Groups
        1. Finding the Differences between Paired Groups
        2. Summarizing Differences with PROC UNIVARIATE
        3. Summarizing Differences with Other Procedures
      3. Building Hypothesis Tests to Compare Paired Groups
        1. Deciding Which Statistical Test to Use
        2. Understanding Significance
          1. Groups Significantly Different
          2. Groups Not Significantly Different
      4. Performing the Paired-Difference t-test
        1. Assumptions
        2. Using PROC UNIVARIATE to Test Paired Differences
          1. Finding the p-value
        3. Using PROC TTEST to Test Paired Differences
          1. Finding the p-value
          2. Understanding Other Items in the Output
        4. Identifying ODS Tables
      5. Performing the Wilcoxon Signed Rank Test
        1. Finding the p-value
        2. Understanding Other Items in the Output
        3. Identifying ODS Tables
      6. Summary
        1. Key Ideas
        2. Syntax
        3. Example
    2. Chapter 8: Comparing Two Independent Groups
      1. Deciding between Independent and Paired Groups
      2. Summarizing Data
        1. Using PROC MEANS for a Concise Summary
        2. Using PROC UNIVARIATE for a Detailed Summary
        3. Adding Comparative Histograms to PROC UNIVARIATE
        4. Using PROC CHART for Side-by-Side Bar Charts
        5. Using PROC BOXPLOT for Side-by-Side Box Plots
      3. Building Hypothesis Tests to Compare Two Independent Groups
        1. Deciding Which Statistical Test to Use
        2. Understanding Significance
          1. Groups Significantly Different
          2. Groups Not Significantly Different
      4. Performing the Two-Sample t-test
        1. Assumptions
        2. Testing for Equal Variances
        3. Testing to Compare Two Means
          1. Finding the p-value
          2. Understanding Information in the Output
        4. Changing the Alpha Level for Confidence Intervals
      5. Performing the Wilcoxon Rank Sum Test
        1. Using PROC NPAR1WAY for the Wilcoxon Rank Sum Test
          1. Finding the p-value
          2. Understanding Tables in the Output
      6. Summary
        1. Key Ideas
        2. Syntax
        3. Example
    3. Chapter 9: Comparing More Than Two Groups
      1. Summarizing Data from Multiple Groups
        1. Creating Comparative Histograms for Multiple Groups
        2. Creating Side-by-Side Box Plots for Multiple Groups
      2. Building Hypothesis Tests to Compare More Than Two Groups
        1. Using Parametric and Nonparametric Tests
        2. Balanced and Unbalanced Data
        3. Understanding Significance
          1. Groups Significantly Different
          2. Groups Not Significantly Different
      3. Performing a One-Way ANOVA
        1. Understanding Assumptions
        2. Performing the Analysis of Variance
        3. Understanding Results
          1. Finding the p-value
          2. Understanding the First Page of Output
          3. Understanding the Second Page of Output
      4. Analysis of Variance with Unequal Variances
        1. Testing for Equal Variances
        2. Performing the Welch ANOVA
      5. Summarizing PROC ANOVA
      6. Performing a Kruskal-Wallis Test
        1. Assumptions
        2. Using PROC NPAR1WAY
        3. Understanding Results
          1. Finding the p-value
          2. Understanding Other Items in the Tables
        4. Summarizing PROC NPAR1WAY
      7. Understanding Multiple Comparison Procedures
        1. Performing Pairwise Comparisons with Multiple t-Tests
          1. Deciding Which Means Differ
          2. Understanding Other Items in the Report
        2. Using the Bonferroni Approach
          1. Deciding Which Means Differ
          2. Understanding Other Items in the Report
        3. Performing the Tukey-Kramer Test
          1. Deciding Which Means Differ
          2. Understanding Other Items in the Report
        4. Changing the Alpha Level
          1. Deciding Which Means Differ
        5. Using Dunnett's Test When Appropriate
          1. Deciding Which Means Differ
          2. Understanding Other Items in the Report
        6. Recommendations
        7. Summarizing Multiple Comparison Procedures
          1. Using PROC ANOVA Interactively
      8. Summarizing with an Example
          1. Step 1: Create a SAS data set.
          2. Step 2: Check the data set for errors.
          3. Step 3: Choose the significance level for the test.
          4. Step 4: Check the assumptions for the test.
          5. Step 5: Perform the test.
          6. Step 6: Make conclusions from the test results.
      9. Summary
        1. Key Ideas
        2. Syntax
        3. Example
  6. Part 4: Fitting Lines to Data
    1. Chapter 10: Understanding Correlation and Regression
      1. Summarizing Multiple Continuous Variables
        1. Creating Scatter Plots
          1. Using ODS Statistical Graphics
        2. Creating a Scatter Plot Matrix
        3. Using Summary Statistics to Check Data for Errors
        4. Reviewing PROC CORR Syntax for Summarizing Variables
      2. Calculating Correlation Coefficients
        1. Understanding Correlation Coefficients
        2. Understanding Tests for Correlation Coefficients
        3. Working with Missing Values
        4. Reviewing PROC CORR Syntax for Correlations
        5. Cautions about Correlations
        6. Questions Not Answered by Correlation
      3. Performing Straight-Line Regression
        1. Understanding Least Squares Regression
        2. Explaining Regression Equations
        3. Assumptions for Least Squares Regression
        4. Steps in Fitting a Straight Line
      4. Fitting a Straight Line with PROC REG
        1. Finding the Equation for the Fitted Straight Line
        2. Understanding the Parameter Estimates Table
        3. Understanding the Fit Statistics Table
        4. Understanding the Analysis of Variance Table
        5. Understanding Other Items in the Results
        6. Using PROC REG Interactively
        7. Printing Predicted Values and Limits
          1. Defining Prediction Limits
          2. Defining Confidence Limits for the Mean
          3. Using PROC REG to Print Predicted Values and Limits
        8. Plotting Predicted Values and Limits
          1. Using ODS Graphics
          2. Using Traditional Graphics with the PLOT Statement
        9. Summarizing Straight-Line Regression
      5. Fitting Curves
        1. Understanding Polynomial Regression
        2. Fitting Curves with PROC REG
          1. Checking the Assumptions for Regression
          2. Performing the Analysis
        3. Understanding Results for Fitting a Curve
        4. Printing Predicted Values and Limits
        5. Changing the Alpha Level
        6. Plotting Predicted Values and Limits
          1. Using ODS Graphics
          2. Using Traditional Graphics with the PLOT Statement
        7. Summarizing Polynomial Regression
      6. Regression for Multiple Independent Variables
        1. Understanding Multiple Regression
        2. Fitting Multiple Regression Models in SAS
        3. Understanding Results for Multiple Regression
          1. Finding the Regression Equation
          2. Testing for Significance of Parameters
          3. Checking the Fit of the Model
        4. Printing Predicted Values and Limits
        5. Summarizing Multiple Regression
      7. Summary
        1. Key Ideas
        2. Syntax
        3. Example
      8. Special Topic: Line Printer Plots
    2. Chapter 11: Performing Basic Regression Diagnostics
      1. Concepts in Plotting Residuals
        1. Plotting Residuals against Predicted Values
        2. Residuals, Predicted Values, and Outlier Points
        3. Plotting Residuals against Independent Variables
        4. Plotting Residuals in Time Sequence
      2. Creating Residuals Plots for the Kilowatt Data
        1. Residuals Plots for Straight-Line Regression
          1. Plotting Residuals against Independent Variables
          2. Plotting Residuals against Predicted Values
          3. Plotting Residuals in Time Sequence
          4. Summarizing Residuals Plots for the Straight-Line Model
        2. Residuals Plots for Multiple Regression
          1. Plotting Residuals against Independent Variables
          2. Plotting Residuals against Predicted Values
          3. Plotting Residuals in Time Sequence
          4. Summarizing Residuals Plots for Multiple Regression
      3. Creating Residuals Plots for the Engine Data
        1. Residuals Plots for Straight-Line Regression
          1. Plotting Residuals against the Independent Variable
          2. Plotting Residuals against Predicted Values
          3. Plotting Residuals in Time Sequence
          4. Summarizing Residuals Plots for Straight-Line Fit
        2. Residuals Plots for Fitting a Curve
          1. Plotting Residuals against the Independent Variable
          2. Plotting Residuals against Predicted Values
          3. Plotting Residuals in Time Sequence
          4. Summarizing Residuals Plots for Fitting a Curve to Engine Data
      4. Looking for Outliers in the Data
        1. Data with Outliers (Engine)
        2. Data without Outliers (Kilowatt)
      5. Investigating Lack of Fit
        1. Concepts for Lack of Fit
        2. Checking Lack of Fit for the Kilowatt Data
          1. Straight-Line Regression
          2. Multiple Regression
        3. Checking Lack of Fit for the Engine Data
          1. Straight-Line Regression
          2. Fitting a Curve
      6. Testing the Regression Assumption for Errors
        1. Checking Normality of Errors for the Kilowatt Data
        2. Checking Normality of Errors for the Engine Data
      7. Summary
        1. Key Ideas
        2. Syntax
        3. Example
      8. Special Topic: Creating Diagnostic Plots with Traditional Graphics and Line Printer Plots
        1. Traditional Graphics
        2. Line Printer Plots
      9. Special Topic: Automatic ODS Graphics
  7. Part 5: Data in Summary Tables
    1. Chapter 12: Creating and Analyzing Contingency Tables
      1. Defining Contingency Tables
      2. Summarizing Raw Data in Tables
        1. Understanding the Results
        2. Suppressing Statistics
        3. Summarizing PROC FREQ for Tables from Raw Data
      3. Creating Contingency Tables from an Existing Summary Table
        1. Summarizing PROC FREQ for Tables from Summary Data
      4. Creating Contingency Tables for Several Variables
        1. Printing Only One Table per Page
      5. Performing Tests for Independence
        1. Understanding Chi-Square Test Results
          1. Viewing Expected Cell Frequencies
        2. Understanding Fisher's Exact Test Results
      6. Creating Measures of Association with Ordinal Variables
        1. Understanding the Results
          1. Finding the p-value
          2. Understanding the Measures of Association
        2. Changing the Confidence Level
      7. Summarizing Analyses with PROC FREQ
      8. Summary
        1. Key Ideas
        2. Syntax
        3. Example
      9. Special Topic: ODS Statistical Graphics
  8. Appendix 1: Further Reading
      1. Statistics References
      2. SAS Press Books and SAS Documentation
  9. Appendix 2: Summary of SAS Elements and Chapters
  10. Appendix 3: Introducing the SAS Windowing Environment
    1. Viewing Initial Windows
    2. Creating a SAS Program
      1. Viewing the Program Editor Window
      2. Copying, Cutting, and Pasting
    3. Submitting a Program
      1. Recalling a Submitted Program
    4. Saving, Including, and Printing Programs
    5. Printing and Saving Output
      1. Saving SAS Tables
      2. Saving or Printing Graphs
  11. Appendix 4: Overview of SAS Enterprise Guide
    1. Purpose and Audience for SAS Enterprise Guide
    2. Summary of Software Structure
    3. Overview of Key Features
    4. References