Selecting Patients with Duplicate Observations by Using a Macro List and SQL

Another quick, easy, and efficient way to select observations with duplicate ID’s is to create a macro variable that contains all the patient ID’s in the duplicate data set (DUP_NO). Using a short SQL step, you can create a list of patient numbers, separated by spaces or commas (both will work) and placed in quotes, that can subsequently be used as the argument in an IN statement. Program 5-9 demonstrates this.

Program 5-9. Using PROC SQL to Create a List of Duplicates
PROC SQL NOPRINT;
   SELECT QUOTE(PATNO)  1
      INTO :DUP_LIST SEPARATED BY " "  2
      FROM DUP_NO;
QUIT;


PROC PRINT DATA=CLEAN.PATIENTS;
   WHERE PATNO IN (&DUP_LIST);  3 TITLE "Duplicates Selected Using SQL and a Macro ...

Get Cody’s Data Cleaning Techniques Using SAS® Software now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.