
Data is naturally fuzzy. There are many reasons for this: typing mistakes, abbreviations, inconsistent formats, and so on. The following screenshot (sourced from Microsoft) shows an example of two records for the same person:


From a human point of view, both records shown in the preceding screenshot belong to the same person; they differ only in some abbreviations and string formats. From the computer's point of view, however, these records are different; in other words, they are not an exact match.

The data matching component of DQS works with a similarity threshold between domain values. The data steward can create matching policies in the ...
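
As a rough illustration of what a similarity threshold means, the following Python sketch scores two strings between 0 and 1 and treats them as a match when the score reaches a chosen threshold. This is not how DQS computes similarity internally; it uses Python's standard `difflib.SequenceMatcher` purely as a stand-in. The sample records, the function names, and the 0.8 default threshold are all hypothetical and are not taken from the screenshot.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a score from 0.0 to 1.0 after normalizing case and whitespace."""
    norm_a = " ".join(a.lower().split())
    norm_b = " ".join(b.lower().split())
    return SequenceMatcher(None, norm_a, norm_b).ratio()


def is_match(a: str, b: str, threshold: float = 0.8) -> bool:
    """Treat two values as the same entity when the score reaches the threshold.

    The 0.8 default is an arbitrary illustrative value, not a DQS setting.
    """
    return similarity(a, b) >= threshold


# Hypothetical records: same person, different abbreviations and formats.
rec1 = "Robert Smith, 123 Main Street"
rec2 = "Rob Smith, 123 Main St."

if __name__ == "__main__":
    print("exactly equal:", rec1 == rec2)
    print("similarity score:", round(similarity(rec1, rec2), 3))
    print("match at default threshold:", is_match(rec1, rec2))
```

Raising the threshold makes matching stricter (fewer false matches, more missed duplicates), while lowering it does the opposite. That trade-off is the one a data steward tunes when defining a matching policy.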

