Chapter 9

Preparing Data

In This Chapter

  • Documenting your business objectives
  • Processing your data
  • Sampling your data
  • Transforming your data

The roadmap to building a successful predictive model runs through three stages: defining business objectives, preparing the data, and then building and deploying the model. This chapter covers data preparation, which consists of:

  • Acquiring the data
  • Exploring the data
  • Cleaning the data
  • Selecting variables of interest
  • Generating derived variables
  • Extracting, loading, and transforming the data
  • Sampling the data into training and test datasets
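
To make a few of these steps concrete, here is a minimal sketch in Python covering cleaning, generating a derived variable, and sampling into training and test datasets. Everything in it is an assumption made for illustration: the sample records, the field names (original_price, paid_price, discount), the policy of dropping incomplete records, and the 25 percent test fraction are hypothetical, not something this chapter prescribes.

```python
import random

# Hypothetical raw records; None marks a missing value.
raw = [
    {"customer": "A", "original_price": 20.0, "paid_price": 15.0},
    {"customer": "B", "original_price": 10.0, "paid_price": 10.0},
    {"customer": "C", "original_price": None, "paid_price": 8.0},
    {"customer": "D", "original_price": 40.0, "paid_price": 30.0},
    {"customer": "E", "original_price": 50.0, "paid_price": 45.0},
]


def clean(records):
    """Drop records with any missing value (one simple cleaning policy)."""
    return [r for r in records if all(v is not None for v in r.values())]


def add_discount(records):
    """Derive a new variable: fractional discount off the original price."""
    return [
        {**r, "discount": (r["original_price"] - r["paid_price"]) / r["original_price"]}
        for r in records
    ]


def split(records, test_fraction=0.25, seed=0):
    """Shuffle a copy, then split it into training and test datasets."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


prepared = add_discount(clean(raw))
train, test = split(prepared)
```

Dropping incomplete records is only one way to clean data; filling in missing values is another, and the right choice depends on the business question. The fixed seed makes the split repeatable, so the same records land in the test dataset on every run.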

Data is a four-letter word. It’s amazing that such a small word can describe trillions of gigabytes of information: customer names, addresses, products, discounted versus original prices, store codes, times of purchase, supplier locations, run rates for print advertising, the color of your delivery vans. And that’s just for openers. Data is, or can be, literally everything.

Not every source or type of data will be relevant to the business question you’re trying to answer. Predictive analytics models are built from multiple ...
