Introduction

The techniques for checking numeric data for errors are quite different from the techniques that you saw in the last chapter for checking character data. Although a numeric variable can usually take on many different values, there are several techniques that you can use to help identify data errors. One simple technique is to examine some of the largest and smallest data values for each numeric variable. If you see values such as 12 or 1200 for a systolic blood pressure (usually between 80 and 200 in healthy adults), you can be quite certain that an error was made, either in entering the data values or on the original data collection form.
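As a rough illustration of the "largest and smallest values" check, here is a minimal sketch in Python (used only for illustration; it is not the book's SAS code). The function name `extreme_values`, the variable `sbp`, and the sample readings are all hypothetical, with 12 and 1200 planted as the kind of errors described above.

```python
import heapq


def extreme_values(values, n=5):
    """Return the n smallest and n largest non-missing values."""
    # Missing readings are represented as None here (an assumption) and skipped.
    present = [v for v in values if v is not None]
    return heapq.nsmallest(n, present), heapq.nlargest(n, present)


# Hypothetical systolic blood pressure readings; 12 and 1200 are planted errors.
sbp = [120, 134, 12, 118, None, 1200, 142, 96, 110, 128]

lowest, highest = extreme_values(sbp, n=3)
print("Lowest:", lowest)
print("Highest:", highest)
```

Listing only a handful of values at each end keeps the output short enough to scan by eye, which is the point of the technique: implausible readings tend to surface at the extremes.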

There are also some internal consistency methods that can be used to identify possible ...
