Data Quality for Analytics Using SAS

Book Description

Analytics offers many capabilities and options to measure and improve data quality, and SAS is perfectly suited to these tasks. Gerhard Svolba's Data Quality for Analytics Using SAS focuses on selecting the right data sources and ensuring data quantity, relevancy, and completeness. The book is made up of three parts. The first part, which is conceptual, defines data quality and contains text, definitions, explanations, and examples. The second part shows how the data quality status can be profiled and the ways that data quality can be improved with analytical methods. The final part details the consequences of poor data quality for predictive modeling and time series forecasting. With this book you will learn how you can use SAS to perform advanced profiling of data quality status and how SAS can help improve your data quality.

Table of Contents

  1. Copyright
  2. Dedication
  3. Acknowledgements
  4. Contents
  5. Introduction
  6. Part I Data Quality Defined
    1. Chapter 1 Introductory Case Studies
    2. Chapter 2 Definition and Scope of Data Quality for Analytics
    3. Chapter 3 Data Availability
    4. Chapter 4 Data Quantity
    5. Chapter 5 Data Completeness
    6. Chapter 6 Data Correctness
    7. Chapter 7 Predictive Modeling
    8. Chapter 8 Analytics for Data Quality
    9. Chapter 9 Process Considerations for Data Quality
  7. Part II Data Quality—Profiling and Improvement
    1. Chapter 10 Profiling and Imputation of Missing Values
    2. Chapter 11 Profiling and Replacement of Missing Data in a Time Series
    3. Chapter 12 Data Quality Control across Related Tables
    4. Chapter 13 Data Quality with Analytics
    5. Chapter 14 Data Quality Profiling and Improvement with SAS Analytic Tools
  8. Part III Consequences of Poor Data Quality—Simulation Studies
    1. Chapter 15 Introduction to Simulation Studies
    2. Chapter 16 Simulating the Consequences of Poor Data Quality for Predictive Modeling
    3. Chapter 17 Influence of Data Quality and Data Availability on Model Quality in Predictive Modeling
    4. Chapter 18 Influence of Data Completeness on Model Quality in Predictive Modeling
    5. Chapter 19 Influence of Data Correctness on Model Quality in Predictive Modeling
    6. Chapter 20 Simulating the Consequences of Poor Data Quality in Time Series Forecasting
    7. Chapter 21 Consequences of Data Quantity and Data Completeness in Time Series Forecasting
    8. Chapter 22 Consequences of Random Disturbances in Time Series Data
    9. Chapter 23 Consequences of Systematic Disturbances in Time Series Data
  9. Appendix A: Macro Code
  10. Appendix B: General SAS Content and Programs
  11. Appendix C: Using SAS Enterprise Miner for Simulation Studies
  12. Appendix D: Macro to Determine the Optimal Length of the Available Data History
  13. Appendix E: A Short Overview on Data Structures and Analytic Data Preparation
  14. References
  15. Index