Finding the right dataset

We need a good historical dataset to build our model. We will mine this dataset to build our model. To continue with the example to our fictitious company, Furnitica, we will use historical campaign response data from a previous campaign run by Furnitica.

This is synthetic data, which means it has been synthesized using a random data generation algorithm. A few sample rows in our dataset are presented in Table 2:

Age

Income

Gender

Folder

Response

61

30974

0

1

0

42

38260

0

3

0

40

20135

0

4

0

88

30645

0

5

0

58

38078

1

3

0

73

20445

0

4

0

34

66198

0

3

0

65

48657

0

2

0

68

39309

0

1

0

Table 2 Sample credit card approval data

This dataset is generated from the response ...

Get Hadoop Blueprints now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.