Data Warehousing, OLAP, Analytic Sandboxes, and Data Mining
The final chapters of this book cover the topic of data, a topic that has been present in all the earlier chapters, lurking under the surface of discussions about methodology and techniques. As a bad pun to emphasize the importance of data, one could say that “data” is data mining's first name. This chapter puts data into the larger context of decision support systems. The remaining chapters zoom in for a closer look at methods for transforming data to make records more suitable for data mining, at clever ways of bringing important information to the surface, and at external sources of data.
Since the introduction of computers into data processing centers a few decades ago, just about every operational system in business has been computerized, spewing out large amounts of data along the way. Data mining is one way of making sense of this deluge. Automation has changed how people do business and how we live: online retailing, social networking, automated tellers, adjustable rate mortgages, just-in-time inventory control, credit cards, Google, overnight deliveries, and frequent flier/buyer clubs are a few examples of how computer-based automation has opened new markets and revolutionized existing ones. Automation has also created immense amounts of data at the companies that profit from these activities. Data accumulates, but not information, and not the right information at the right time.