Using PROC MEANS, PROC TABULATE, and PROC UNIVARIATE to Look for Outliers

One of the simplest ways to check for invalid numeric values is to run either PROC MEANS or PROC UNIVARIATE. By default, PROC MEANS lists the minimum and maximum values, along with the N (the number of nonmissing values), mean, and standard deviation. PROC UNIVARIATE is somewhat more useful for detecting invalid values because it lists the five highest and five lowest values and, when you specify the PLOT option, produces graphical output (stem-and-leaf plots and box plots). Let’s first look at how you can use PROC MEANS for very simple checking of numeric variables. The program below checks the three numeric variables, heart rate (HR), systolic blood pressure (SBP), and diastolic blood pressure (DBP), in the PATIENTS ...
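As a rough illustration of this kind of check (a minimal sketch, not the author's program), the steps below assume a data set named PATIENTS in the WORK library containing the numeric variables HR, SBP, and DBP. The library location, the NMISS statistic, and the MAXDEC= setting are assumptions added here for illustration.

```sas
/* Sketch only: assumes WORK.PATIENTS exists with numeric HR, SBP, DBP. */

/* PROC MEANS: request the default statistics explicitly, plus NMISS   */
/* (count of missing values); MAXDEC=1 limits output to one decimal.   */
proc means data=patients n nmiss min max mean std maxdec=1;
   var HR SBP DBP;
run;

/* PROC UNIVARIATE: the Extreme Observations table shows the five      */
/* lowest and five highest values; PLOT requests the stem-and-leaf     */
/* (or bar chart), box plot, and normal probability plot.              */
proc univariate data=patients plot;
   var HR SBP DBP;
run;
```

In the PROC MEANS output, a minimum or maximum far outside the physiologically plausible range for a variable is a quick signal that some values need a closer look. The extreme-values listing from PROC UNIVARIATE then shows which specific values are involved.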
