Anonymizing Health Data by Luk Arbuckle, Khaled El Emam

Chapter 13. De-Identification and Data Quality

As is evident from the case studies we’ve presented, anonymization results in some distortion of the original data. What we want to do now is discuss the amount of distortion that can be introduced and how it can be effectively managed. We’ll focus on de-identification, not masking, because it’s de-identification that distorts the variables we might want to use for analysis. The amount of distortion is referred to as “information loss,” or conversely “data utility.”

Data utility is important for those using anonymized data, because the results of their analyses are critical for informing major care, policy, and investment decisions. Also, the cost of getting access to data is not trivial, making it important to ensure the quality of the data received. What we really want to know is whether the inferences drawn from de-identified data are reliable—that is, are they the same inferences we would draw from the original data?
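One way to make that question concrete is to run the same analysis on both versions of a variable and compare the answers. The sketch below is only an illustration under our own assumptions: the ages are invented, the helper names (`generalize_age`, `midpoint`, `mean_shift`) are hypothetical, and generalizing into fixed-width bands and analyzing the band midpoints is just one simple stand-in for a de-identification step, not a method prescribed here.

```python
from statistics import mean


def generalize_age(age: int, width: int) -> tuple[int, int]:
    """Return the inclusive (low, high) band of the given width containing age."""
    low = (age // width) * width
    return low, low + width - 1


def midpoint(band: tuple[int, int]) -> float:
    """Represent a band by its midpoint so it can still be analyzed numerically."""
    low, high = band
    return (low + high) / 2


def mean_shift(ages: list[int], width: int) -> float:
    """Absolute difference between the original mean age and the mean of the
    band midpoints after generalization (a crude information-loss measure)."""
    generalized = [midpoint(generalize_age(a, width)) for a in ages]
    return abs(mean(ages) - mean(generalized))


# Hypothetical ages, invented purely for illustration.
ages = [23, 27, 31, 38, 44, 45, 52, 59, 63, 71]

for width in (5, 10, 20):
    print(f"band width {width}: mean shifts by {mean_shift(ages, width):.2f} years")
```

A small shift in the mean doesn't by itself show that the de-identified data is fit for every purpose. The same comparison would need to be repeated for whatever statistics or models the analyst actually intends to use.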

Useful Data from Useful De-Identification

It may be obvious, but it’s worth repeating that poor de-identification techniques will result in less data utility. In fact, the data utility that a method preserves is one key way to evaluate its quality. Many of the de-identification techniques that we’ve described are essentially optimization problems. Some optimization methods maximize data utility, while others minimize the risk of re-identification at the expense of data utility. Not all optimization methods are created equal. The ...
