Part II: Data Quality—Profiling and Improvement

Introduction

This second part of the book shows methods you can use to profile and improve the quality of the data.

Profiling

The profiling methods focus primarily on advanced characteristics of the data, such as:

the structure of missing values in a one-row-per-subject data mart

the structure of missing values in a time series data mart

the absence of entire observations in time series data

the detection of complex outliers like multivariate outliers or outliers in time series data

the detection of duplicate records in the data based on matching algorithms
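The difference between missing values and missing observations can be made concrete with a small sketch. The Python code below is an illustration only and is not taken from the book, which works in SAS; the function names (missing_patterns, missing_months), the data, and the pattern notation ('X' for present, '.' for missing) are all assumptions made for this example. The first function tabulates the missing-value patterns in a one-row-per-subject table, and the second lists the months for which a time series has no record at all.

```python
from collections import Counter


def missing_patterns(rows, columns):
    """Count how often each missing-value pattern occurs.

    A pattern has one character per column:
    'X' if the value is present, '.' if it is missing (None).
    """
    patterns = Counter()
    for row in rows:
        pattern = "".join("." if row.get(col) is None else "X" for col in columns)
        patterns[pattern] += 1
    return patterns


def missing_months(observed, start, end):
    """Return the (year, month) pairs from start to end (inclusive)
    for which no record exists in the series."""
    seen = set(observed)
    gaps = []
    year, month = start
    while (year, month) <= end:
        if (year, month) not in seen:
            gaps.append((year, month))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return gaps


# Hypothetical one-row-per-subject data: one dict per customer.
customers = [
    {"age": 34, "income": 52000, "region": "N"},
    {"age": None, "income": 48000, "region": "S"},
    {"age": 51, "income": None, "region": None},
    {"age": None, "income": 61000, "region": "E"},
]
patterns = missing_patterns(customers, ["age", "income", "region"])

# Hypothetical monthly series in which two months have no record.
months = [(2023, 1), (2023, 2), (2023, 5), (2023, 6)]
gaps = missing_months(months, (2023, 1), (2023, 6))
```

In the first case every record exists but some of its fields are empty, so the pattern counts show which variables tend to be missing together. In the second case the records themselves are absent, which can only be detected by comparing the observed time points against the expected calendar.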

Methods for simple data profiling and validation are mentioned only briefly, and the reader is directed to the respective references. ...
