A Simple Example of a Validation Data Set

Let’s start with a simple validation data set that handles only range checking for numeric variables. This example uses the following three rules.

Valid heart rate values are between 40 and 100.

Valid values for systolic blood pressure are between 80 and 200.

Valid values for diastolic blood pressure are between 60 and 120.

For this simple example, all missing values are treated as invalid. A later program, Program 9-3, generalizes this so that missing values can be treated as either valid or invalid for each numeric variable.
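Before turning to a validation data set, it may help to see the three rules written directly as range checks. The following is only a sketch and not one of the book's numbered programs. The data set name Sample, the variable names Patno, HR, SBP, and DBP, and the test values are all assumptions made for illustration.

```sas
/* Hypothetical test data. Patno, HR, SBP, and DBP are assumed names. */
data Sample;
   input Patno $ HR SBP DBP;
datalines;
001 72 120 80
002 . 210 70
003 38 118 130
;

/* Apply the three range rules with the limits hard-coded.          */
/* A missing numeric value compares lower than any number in SAS,   */
/* so it fails each test and is reported as invalid.                */
data _null_;
   set Sample;
   if not (40 le HR  le 100) then put Patno= HR=  "is invalid";
   if not (80 le SBP le 200) then put Patno= SBP= "is invalid";
   if not (60 le DBP le 120) then put Patno= DBP= "is invalid";
run;
```

With these assumed values, the messages written to the SAS log should flag the missing HR and the high SBP for patient 002, and the low HR and high DBP for patient 003. Hard-coding the limits this way is what a validation data set is meant to avoid, because the limits then live in data rather than in program statements.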

Here, the validation data set contains three pieces of information for each variable to be checked: the variable name, the minimum valid value, and the maximum valid value.
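As a rough illustration of that layout, a data set with one observation per rule could be built as shown here. This is a minimal sketch under stated assumptions and is not a reproduction of Program 9-1: the variable names VarName, Min, and Max, and the use of HR, SBP, and DBP as the names of the variables being checked, are hypothetical.

```sas
/* Sketch of a validation data set: one observation per numeric variable. */
/* VarName, Min, and Max are assumed names chosen for illustration.       */
data Valid;
   length VarName $ 32;        /* room for any valid SAS variable name */
   input VarName $ Min Max;
datalines;
HR 40 100
SBP 80 200
DBP 60 120
;

/* List the rules to confirm they were read as intended */
proc print data=Valid noobs;
run;
```

Storing the limits as data means that a rule can be changed or added by editing an observation, without touching the checking program.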

Program 9-1 creates a validation data set called VALID.

Program 9-1. Creating ...
