Checking for Invalid Character Values

Let’s start with checking for invalid character values. For these examples, let’s use the SAS data set PATIENTS (see the Appendix for the program and data file), and look for invalid values for GENDER, DX, and AE. In Program 8-2, missing values are reported as invalid for GENDER and AE. Later, the program is modified so that missing values are not reported.

Program 8-2. Using SQL to Look for Invalid Character Values
LIBNAME CLEAN "C:\CLEANING";
***Checking for invalid character data;
PROC SQL;
   TITLE "Checking for Invalid Character Data";
   SELECT PATNO,
          GENDER,
          DX,
          AE
   FROM CLEAN.PATIENTS
   WHERE GENDER NOT IN ('M','F')           OR
         VERIFY(DX,'0123456789 ') NE 0     OR
         AE NOT IN ('0','1');
QUIT;

Because there is no CREATE ...

Get Cody’s Data Cleaning Techniques Using SAS® Software now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.