Chapter 6: Data Preparation

6.1 Overview

6.2 Transactional Data Versus Time Series Data

6.3 Matching Frequencies

6.3.1 Contracting

6.3.2 Expanding

6.4 Merging

6.5 Imputation

6.6 Outliers

6.7 Transformations

6.8 Summary

6.1 Overview

As might be expected, data received for use in data mining for forecasting is not necessarily in the right format or clean enough for analysis. Almost always, the Y variable of interest comes from a different data source or system than the exogenous, or X, data. More often than not, the Y data is in what is called a transactional format, which is typical of most ERP systems. This transactional data has to be converted to time series data. Generally, the exogenous data comes from a source that is ...
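
To make the transactional-to-time-series conversion mentioned above concrete, here is a minimal sketch in Python (not SAS, and not the book's own method). The records, the function name `to_monthly_series`, the choice of a monthly interval, and the decision to fill empty months with zero are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transactional records: one row per order, at irregular dates.
transactions = [
    (date(2023, 1, 5), 120.0),
    (date(2023, 1, 19), 80.0),
    (date(2023, 3, 2), 50.0),
    (date(2023, 3, 30), 25.0),
]


def to_monthly_series(records):
    """Accumulate (date, amount) records into equally spaced monthly totals."""
    totals = defaultdict(float)
    for day, amount in records:
        totals[(day.year, day.month)] += amount
    if not totals:
        return []

    first, last = min(totals), max(totals)
    series = []
    year, month = first
    while (year, month) <= last:
        # Assumption: a month with no transactions is recorded as 0.0.
        # Depending on the data, treating it as missing may be more appropriate.
        series.append((f"{year}-{month:02d}", totals.get((year, month), 0.0)))
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return series


monthly = to_monthly_series(transactions)
for period, total in monthly:
    print(period, total)
```

The point of the sketch is that a time series has exactly one value per equally spaced period, including periods in which no transaction occurred, whereas the transactional input has one row per event.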