Chapter 9: Data Cleaning Tasks

Introduction

Task: Looking for possible data errors using a given range

Task: Demonstrating a macro to report on outliers using fixed ranges

Task: Demonstrating a macro that performs automatic outlier detection

How the macro works

Conclusion

Introduction

One of the first tasks facing a data analyst is to check for possible invalid numeric values. For some variables, you can flag a value as a possible error when it falls outside a given range. For example, a value for resting heart rate would be unusual if it were below 40 or above 100.
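The fixed-range check can be sketched in a few lines. This is a minimal illustration in Python, not the SAS macro the chapter goes on to present. The function name `flag_out_of_range`, the sample heart rates, and the choice to skip missing values rather than report them are all assumptions made for the example.

```python
import math


def flag_out_of_range(values, low, high):
    """Return (index, value) pairs for non-missing values outside [low, high].

    Hypothetical helper for illustration only; not taken from the book.
    """
    flagged = []
    for i, v in enumerate(values):
        # Assumption: missing values (None or NaN) are skipped here,
        # not reported as range errors.
        if v is None or (isinstance(v, float) and math.isnan(v)):
            continue
        # Values equal to a limit are treated as in range.
        if v < low or v > high:
            flagged.append((i, v))
    return flagged


# Made-up resting heart rates; 40 and 100 are the limits from the text.
heart_rates = [72, 38, 65, None, 110, 100, 40]
suspect = flag_out_of_range(heart_rates, low=40, high=100)

for index, value in suspect:
    print(f"Observation {index}: heart rate {value} is outside 40-100")
```

A value flagged this way is only a candidate for review. Whether it is a real error still depends on checking the source record.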

Another approach for identifying possible numeric outliers is to see if a given data value does not seem to “belong” with the other values. A common approach is to compute a mean ...
