Identifying Subjects with “n” Observations Each (DATA Step Approach)

Besides identifying duplicates, you may need to verify that there are “n” observations per subject in a raw data file or in a SAS data set. For example, if each patient in a clinical trial was seen twice, you might want to verify that there are two observations for each patient ID in the file or data set. You can accomplish this task by using a DATA step approach or by using PROC FREQ, the same two methods used earlier to detect duplicates. First, let’s look at the DATA step approach.

The key to this approach is the use of the temporary variables FIRST. and LAST., which are created when you use a SET statement followed by a BY statement. To test the programs, let’s use the data set PATIENTS2, ...
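As a rough illustration of the idea, here is a minimal sketch of how FIRST. and LAST. can be used to count observations per patient. It is not the book's own program. It assumes that PATIENTS2 resides in the WORK library, that the patient ID variable is named PATNO, and that each patient should have exactly two observations; the output data set name SORTED and the counter N are hypothetical names chosen for this example.

```sas
* Sketch only: PATNO, WORK.PATIENTS2, SORTED, and N are assumed names.;

* BY-group processing requires the data to be sorted by the BY variable.;
proc sort data=work.patients2 out=work.sorted;
   by patno;
run;

data _null_;
   set work.sorted;
   by patno;

   * FIRST.PATNO is 1 on the first observation of each patient, so reset the counter there.;
   if first.patno then n = 0;

   * Sum statement: N is retained across observations and incremented by 1.;
   n + 1;

   * LAST.PATNO is 1 on the last observation of each patient, so N now holds the total.;
   if last.patno and n ne 2 then
      put "Patient number " patno "has " n "observation(s)";
run;
```

Because the step uses DATA _NULL_, no output data set is created; any patient whose count differs from two is simply written to the SAS log. Changing the value 2 in the comparison adapts the check to any other “n”.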
