11 DataFlux and dfPower Studio

Introduction

Examples

Introduction

DataFlux, a SAS product (together with its graphical user interface, dfPower Studio), is designed to perform data cleaning functions.

This product allows you to collect, store, examine, and manage data quality logic, data integration logic, and business rules. For example, you can:

• perform address standardization

• standardize company and product names

• perform "fuzzy" matches among files

• parse names so that matching can be accomplished

• identify which records belong to the same household

• perform redundant data identification

• eliminate duplicates by using clustering algorithms to find duplicate and near-duplicate records

• merge records

• perform data linking ...
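
To give a rough feel for what "fuzzy" matching and near-duplicate clustering mean in the list above, here is a minimal Python sketch. It is not how DataFlux or dfPower Studio works internally; it only illustrates the general idea using the standard library's difflib module. The function names (normalize, similarity, cluster_near_duplicates), the sample names, and the 0.85 threshold are all assumptions made for this illustration.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    kept = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(kept.split())


def similarity(a: str, b: str) -> float:
    """Return a score between 0 and 1; 1.0 means identical after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def cluster_near_duplicates(records, threshold=0.85):
    """Greedy clustering: each record joins the first cluster whose
    first member is similar enough, otherwise it starts a new cluster.
    The threshold is a hypothetical value chosen for illustration."""
    clusters = []
    for rec in records:
        for cluster in clusters:
            if similarity(rec, cluster[0]) >= threshold:
                cluster.append(rec)
                break
        else:
            # No existing cluster was close enough.
            clusters.append([rec])
    return clusters


# Hypothetical sample data: three spellings of one name plus an unrelated one.
names = ["John Smith", "john smith", "Jon Smith", "Mary Jones"]
clusters = cluster_near_duplicates(names)
for group in clusters:
    print(group)
```

Comparing each record only against the first member of a cluster keeps the sketch short, but it makes the result depend on input order. A production tool would typically use more robust matching rules than a single string-similarity score.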
