CRISP-DM

Cross Industry Standard Process for Data Mining (CRISP-DM) describes a data-mining process commonly used by data scientists in industry. CRISP-DM breaks the data-mining science process into six major phases:

  • Business understanding
  • Data understanding
  • Data preparation
  • Modeling
  • Evaluation
  • Deployment

In the following diagram, the arrows indicate the process flow, which can move back and forth through the phases. Also, the process doesn't stop with model deployment. The outer arrow indicates the cyclic nature of data science. Lessons learned during the process can trigger new questions and repeat the process while improving previous results:

Get Machine Learning in Java - Second Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.