Using PROC COMPARE with Two Data Sets That Have an Unequal Number of Observations

The ID statement is especially useful when the two data sets do not contain the same number of observations or when the values of the ID variables differ between them. To see how PROC COMPARE handles this situation, look at the two new files (FILE_1B and FILE_2B) shown next. A new patient number (005) has been added to FILE_1 to make FILE_1B, and patient number 004 has been omitted from FILE_2 to make FILE_2B.

FILE_1B


001M10211946130 80
002F12201950110 70
003M09141956140 90
004F10101960180100
005M01041930166 88
007m10321940184110


FILE_2B


001M1021194613080
002F12201950110 70
003M09141956144 90
007M10231940184110

The two SAS data sets (ONE_B and TWO_B) are ...
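
As a rough illustration of the comparison described above, the following sketch runs PROC COMPARE on ONE_B and TWO_B with an ID statement. The variable name PATNO is an assumption (the excerpt does not name the patient-number variable), and the LISTOBS option is included only to make the unmatched observations easy to spot. This is not necessarily the program the text goes on to use.

```sas
/* Sketch only: PATNO is an assumed name for the patient-number variable. */
/* Sort both data sets by the ID variable so that observations can be     */
/* matched on it.                                                         */
PROC SORT DATA=ONE_B;
   BY PATNO;
RUN;

PROC SORT DATA=TWO_B;
   BY PATNO;
RUN;

/* LISTOBS asks for a list of observations whose ID value is found in     */
/* only one of the two data sets.                                         */
PROC COMPARE BASE=ONE_B COMPARE=TWO_B LISTOBS;
   ID PATNO;
RUN;
```

With the files shown, patient numbers 004 and 005 occur only in FILE_1B, so under these assumptions both would be expected to show up as observations present in the base data set but not in the comparison data set. The remaining patients (001, 002, 003, and 007) are matched on the ID variable and compared value by value.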
