10.4. The Data Mining Process

There are probably as many ways to approach data mining as there are data mining practitioners. As with dimensional modeling, starting with the goal of adding business value leads to a clear series of steps that just make sense.

You'll be shocked to hear that our data mining process begins with an understanding of the business opportunities. Figure 10.3 shows the three major phases of the data mining process and the major task areas within those phases.

ADDITIONAL INFORMATION ON THE DATA MINING PROCESS

We didn't invent this process; we just stumbled on it through trial and error. Others who've spent their entire careers on data mining have arrived at similar approaches. We're fortunate that they have documented their processes in detail in their own publications. In particular, three sources have been valuable to us. The book Data Mining Techniques, 2nd Ed. by Michael J. A. Berry and Gordon S. Linoff (Wiley, 2004) describes a process Berry and Linoff call the Virtuous Cycle of Data Mining. Another, similar approach comes from a special interest group that was formed in the late 1990s to define a data mining process. The result was published as the Cross Industry Standard Process for Data Mining (CRISP-DM). Visit www.crisp-dm.org for more information. Also, the SQL Server Books Online topic "The Data Mining Process" presents a similar approach.

Figure 10.3. The data mining process

Like most of the processes in the DW/BI system, the ...
