Mastering SQL Server 2014 Data Mining by Debarchan Sarkar, Amarpreet Singh Bassan

Understanding and cleansing data

Before entering the warehouse, the data should go through several cleaning operations. These involve shaping the raw data through transformations, removal of null values, removal of duplicates, and so on. In this section, we will discuss the techniques and methodologies for understanding and cleansing data. We will see how to identify the data that needs cleansing and how to set a benchmark for data sanity that later data refreshes can be checked against.
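
To make the three operations named above concrete (transformation, null removal, and duplicate removal), here is a minimal sketch in plain Python. It is not the book's method and does not use SQL Server; the sample rows, the column names, and the `normalize` and `cleanse` helpers are all hypothetical and exist only for illustration.

```python
# Hypothetical raw rows; the column names are illustrative only.
raw_rows = [
    {"customer_id": 1, "name": "  Alice ", "city": "Seattle"},
    {"customer_id": 2, "name": "Bob", "city": None},
    {"customer_id": 1, "name": "alice", "city": "seattle"},
    {"customer_id": 3, "name": "Carol", "city": "Austin"},
]


def normalize(value):
    """Transformation step: trim and lowercase strings, pass other values through."""
    return value.strip().lower() if isinstance(value, str) else value


def cleanse(rows, required=("customer_id", "name", "city")):
    """Return rows that are transformed, free of nulls, and de-duplicated."""
    seen = set()
    cleaned = []
    for row in rows:
        # Transformation: bring text columns to a consistent shape.
        shaped = {key: normalize(val) for key, val in row.items()}
        # Null removal: skip rows that are missing any required column.
        if any(shaped.get(col) is None for col in required):
            continue
        # Duplicate removal: keep only the first row for each distinct tuple.
        fingerprint = tuple(shaped[col] for col in required)
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        cleaned.append(shaped)
    return cleaned


clean_rows = cleanse(raw_rows)
```

Note that the order of the steps matters in this sketch: the third sample row is recognized as a duplicate of the first only because the text is normalized before the rows are compared.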

Data cleansing is the act of detecting data that is out of sync, inaccurate, or incomplete, and then correcting, deleting, or synchronizing that data so that the chances of any ambiguity or inaccuracy in any prediction that is based on this ...
