This book is not for privacy experts.
If you are looking for an in-depth discussion of the legal implications of the Kyllo v. United States (2001) Supreme Court decision or a thorough exploration of how to implement differential privacy in a database, then you should look elsewhere. There is no shortage of invaluable literature on these and many other privacy-related topics, and we recommend it to those readers.
This book is for privacy beginners: those who have a niggling worry that the technology they are creating raises privacy concerns and want to do something about it, but who are not going to spend the next 10 years perusing privacy case law and academic papers trying to figure out how to port those lessons into lines of code.
You may be surprised how frequently what you build has privacy implications, but we live in a time of increasing capabilities for personal re-identification. This book will help you learn to spot privacy questions. If you read nothing but Chapter 1, you’ll better understand how to judge whether what you’re doing is connected to data privacy.
Whether you’re building a new smartphone app in your dorm room or a database empire from your garage, The Architecture of Privacy will be your first step into the world of privacy engineering.
Decisions made by engineers can unleash technology upon the world that can significantly affect fundamental rights. In some cases, this can yield positive outcomes such as the creation of new platforms to exchange ideas that catalyze change in the world’s most oppressive regimes. In other cases, new technologies can become tools of repression and control, enabling governments and corporate interests to track and manipulate individuals with surprising subtlety and at remarkable scale. With such high stakes, it must be in the interest of more than just lawyers and bureaucrats to recognize, promote, or guard against these potential outcomes as needed.
This book is, in part, an effort to empower the engineer. Successful technology is not just technology that works; it is technology that works while simultaneously offering capabilities that protect privacy and civil liberties. Readers of this book will not have to watch helplessly as their technology is misused, nor will they have to wait for others to try to curb that misuse. Instead, they will have the tools to recognize potential risks and design against them, sparing much headache and heartache.
This book is distinctive in the realm of privacy literature as it is written by technical authors who approach privacy and civil liberties from what is currently a highly atypical perspective: how to engineer technologies that will deliver trustworthy safeguards capable of supporting liberal-democratic principles. By contrast, most privacy books are written by professional scholars who take law and policy as their starting point and treat technological concerns as ancillary at best and menacing at worst, which is hardly a perspective that will encourage the engineers of the world.
But this book is not just for engineers. For the non-engineers who read this book—the academics, lawyers, and policymakers—we offer a new perspective. The policy choice is not simply to build or not to build, to ban or not to ban. Instead, these readers will find that engineers can offer an arsenal of technical tools that can form the building blocks of nuanced policies that maximize both privacy protection and utility. This book provides a menu of what to demand in a new technology.
Over the past few years, the public has become aware of the vast scale of data collected and held by governments and corporations. As we produce more data about ourselves through the ubiquitous use of electronic payment systems, mobile devices, and cloud computing services, the institutions around us have concluded that this data holds tremendous value. Unfortunately, the private companies and government agencies that hold data about us do not always put appropriate safeguards in place to prevent deliberate or accidental privacy violations. Sometimes this is because of gaps in their internal policies, or because they misjudged risk or their ability to mitigate it. But sometimes it’s because these organizations don’t have data management systems that offer the technical capabilities necessary to support robust, privacy-protective policies.
That need not be the case. Today we know enough to design systems that build in, from the beginning, appropriate safeguards that can substantially reduce the chances of abuse or mistakes when handling people’s sensitive data. We believe it is time to move away from all systems that don’t have these straightforward and sensible protections in place. We have become heavily reliant on advanced information technology, and we need to be able to trust our systems and each other with our data.
Effective privacy-facilitating technology is designed to minimize the friction between a person and their work. Capabilities can and should be designed in such a way that they enable privacy-protective policies and procedures while creating as few hurdles as possible in using the system. The easier privacy protections are to use and the more unnoticeable they are to everyday users, the more likely these protections will be embraced. As soon as a privacy-protective feature becomes cumbersome, some users will look for ways to avoid it or develop shortcuts that will undermine its overall effectiveness. We advocate reducing potential friction by adhering to what is sometimes referred to as “privacy by design”—an approach that incorporates thinking about privacy-protective features and implementing them as early as possible. Capabilities that are part of the core functionality of the product are far less likely to cause friction than those simply grafted on to the technology late in the development process. Specific advice on how to incorporate privacy by design into your product can be found in extensive documentation on the topic elsewhere.
It is important to note that nothing described in this book could be said to automatically protect privacy. Simply having these capabilities in your system won’t guarantee that privacy is protected. Rather, these capabilities must work in concert with legal frameworks and policy in order to be effective. Privacy law is an extremely nuanced field that often depends on subjective evaluations of the legitimacy of certain actions (and those evaluations can change rapidly depending on outside factors)—something that is very difficult to hardcode into a technology.
Access controls, for example (see Part II), are a powerful tool for managing data use, but a user must configure those controls in order to ensure that data is accessed by those who have the authority to see it, and denied to those who do not. Meanwhile, the mere existence of audit logs (see Part III) is not enough to ensure rigorous oversight of system usage—someone must actually read those logs and take effective action when they see misuse of the data. Though just about anything is possible in the world of technology, we should maintain healthy skepticism of any technology that claims to automatically protect privacy while maximizing data utility.
Most likely, any attempt to automate privacy protection is going to lead to a system that is either unnecessarily restrictive, thereby undermining the utility of the system, or too permissive, thereby leaving ample room for misuse of the data (which might not be caught because oversight is reduced on the erroneous assumption that the system can govern itself).
We have tried to write this book in a way that allows readers to skip around, focusing on the topics most relevant to their needs. But we’ve also tried to ensure that the book hangs together as a whole. Our narrative thread therefore goes something like this:
Whenever you collect and process personal and/or sensitive data, you have an obligation—moral in all cases, legal in most—to protect that data from theft, misuse, and abuse. You are directly responsible for designing and implementing security-enhancing and privacy-protective technologies and policies. This is hard! Understanding the different ways in which data can be personally identifying, recognizing the privacy risks associated with different technologies and use cases, implementing measures to mitigate those risks without compromising your original goals, and staying up-to-date on relevant law and policy are complex challenges, and there’s no guaranteed recipe for success. There are, however, several broad categories of technology and policy that are foundational to protecting privacy and civil liberties, and you’ll want to build on these strong foundations.
The opening four chapters of this book focus on the fundamental building blocks necessary to create a privacy-protective system. Chapter 1 is a brief history of the intersections of informational privacy, technology, and privacy law, which situates the reader in the context surrounding these issues. Chapters 2 and 3 cover the data collection technology, policy, and practices that should be transparent to your users or data subjects and should ensure that the kind and amount of data collected is proportional to your product’s or service’s stated purposes. Chapter 4 addresses high-level information security technology and policy needed to protect data from theft and other forms of unauthorized access.
Privacy technology and policy should ensure that data accessed through authorized means is protected from misuse and abuse. This goal is best achieved through some combination of access control and oversight measures.
Chapters 5, 6, and 7 address various ways of restricting and controlling authorized access to data. We describe how to grant differentiated access to the various levels of your system (e.g., application, network, hardware) and apply controls to varying levels of granularity in your data (e.g., system-level, record-level, cell-level). We describe different types of access (e.g., read, write, discovery) and conditions under which access is granted (selective-, purpose-, and scope-driven revelation). We describe federated system architectures that delegate some access-control decisions to the owners of systems separate from your own.
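To make the vocabulary above concrete, here is a minimal sketch of how access types, record-level and cell-level granularity, and purpose-driven conditions might fit together. It is not the design presented in those chapters; every name in it (`Access`, `Grant`, `Record`, `is_allowed`, `read_cells`, and the sample roles and purposes) is a hypothetical chosen for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Access(Enum):
    DISCOVERY = "discovery"  # may learn that a record exists
    READ = "read"            # may view permitted cell values
    WRITE = "write"          # may modify the record


@dataclass(frozen=True)
class Grant:
    """Hypothetical rule: a role may use one access type for one stated purpose."""
    role: str
    access: Access
    purpose: str
    fields: frozenset = frozenset()  # cell-level scope; consulted only for READ


@dataclass
class Record:
    record_id: str
    cells: dict
    grants: list = field(default_factory=list)  # record-level policy


def is_allowed(record, role, access, purpose):
    """Record-level check: does any grant match role, access type, and purpose?"""
    return any(
        g.role == role and g.access == access and g.purpose == purpose
        for g in record.grants
    )


def read_cells(record, role, purpose):
    """Cell-level check: return only the cells covered by matching READ grants."""
    visible = set()
    for g in record.grants:
        if g.role == role and g.access == Access.READ and g.purpose == purpose:
            visible |= g.fields
    return {name: value for name, value in record.cells.items() if name in visible}


# Illustrative data only; the roles, purposes, and fields are assumptions.
patient = Record(
    record_id="r-1",
    cells={"name": "A. Example", "diagnosis": "flu", "zip": "00000"},
    grants=[
        Grant("analyst", Access.DISCOVERY, "research"),
        Grant("analyst", Access.READ, "research", frozenset({"diagnosis", "zip"})),
        Grant("clinician", Access.READ, "treatment", frozenset({"name", "diagnosis"})),
    ],
)
```

In this sketch an analyst reading for research sees the diagnosis and zip cells but not the name, and the same analyst stating any other purpose sees nothing. Someone still has to write the grants, so the policy work stays with people rather than with the code.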
Chapters 8, 9, and 10 center on oversight, the necessary counterpart to access control. In order to hold the system and your users accountable, we present techniques for logging user activity in a way that makes data use auditable. We explain how data retention policies and data-purging technologies should be designed and implemented in a way that complies with regulations and minimizes privacy risks without compromising the usefulness of the system.
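As a rough illustration of these oversight ideas, the sketch below pairs an append-only audit log with a simple age-based purge. It assumes a single in-memory store and a fixed retention window; `AuditedStore`, `AuditEntry`, and their methods are hypothetical names, not an API from this book or from any library.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class AuditEntry:
    timestamp: datetime
    user: str
    action: str
    record_id: str


class AuditedStore:
    """Hypothetical store that records every access and purges old records."""

    def __init__(self, retention):
        self.retention = retention  # maximum age of a record before purging
        self._records = {}          # record_id -> (created_at, value)
        self.audit_log = []         # treated as append-only in this sketch

    def _log(self, user, action, record_id, now):
        self.audit_log.append(AuditEntry(now, user, action, record_id))

    def put(self, user, record_id, value, now):
        self._records[record_id] = (now, value)
        self._log(user, "write", record_id, now)

    def get(self, user, record_id, now):
        # Log the attempt first, so lookups of missing records are auditable too.
        self._log(user, "read", record_id, now)
        entry = self._records.get(record_id)
        return None if entry is None else entry[1]

    def purge_expired(self, now):
        """Delete records older than the retention window and log each purge."""
        expired = [
            rid for rid, (created, _) in self._records.items()
            if now - created > self.retention
        ]
        for rid in expired:
            del self._records[rid]
            self._log("system", "purge", rid, now)
        return expired


# Illustrative usage; the users, dates, and the 30-day window are assumptions.
t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
store = AuditedStore(retention=timedelta(days=30))
store.put("alice", "r-1", "sensitive", t0)
store.put("alice", "r-2", "recent", t0 + timedelta(days=20))
store.get("bob", "r-1", t0 + timedelta(days=1))
purged = store.purge_expired(t0 + timedelta(days=31))
```

Passing the clock in as `now` keeps the sketch deterministic. The log itself enforces nothing: it only makes use of the data reviewable by someone who actually reads it.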
In Chapter 11, we walk through several case studies that demonstrate how these various building blocks can be assembled to solve real problems. In Chapter 12, we describe the role and responsibilities of the Privacy Engineer, an individual who will become increasingly critical to companies that process personal information. Finally, in Chapter 13, we share some thoughts on the future of privacy and how you can prepare for it.
In general, think of the capabilities described in the chapters that follow as a set of building blocks. They can be combined in a variety of ways to support different privacy imperatives. However, not all of these capabilities need to be used in every information system, and not all privacy issues that might arise from the use of those systems can be solved by these technologies.
Please address comments and questions concerning this book to the publisher:
We have a web page for this book, where we list errata, examples, and any additional information. You can access this page at http://bit.ly/architecture-of-privacy.
To comment or ask technical questions about this book, send email to firstname.lastname@example.org.
For more information about our books, courses, conferences, and news, see our website at http://www.oreilly.com.
All the authors wish to acknowledge the extraordinary efforts of Elissa Lerner, editor of this book. Put simply, this book would not exist without her tireless efforts to herd this unruly band of cats as they blew through deadline after deadline and were readily distracted by new and exciting tangents.
We also wish to thank Palantir Technologies and its CEO, Dr. Alex Karp, for encouraging us in this effort. Although the book and its contents in no way represent the views of Palantir Technologies or any of its other employees, these authors would not have met and set out on this course without the support of the Palantir family and their tireless dedication to making the world a better place through technology.
We acknowledge a huge debt of gratitude to Dr. Lawrence Lessig, whose work in this space inspired both the title of this book and our whole approach to the interaction of legal and technical code. We also thank Paul Ohm for contributing an insightful Foreword to this book. Do yourself a favor—find everything that these two have ever written and read it.
We also wish to thank those who provided invaluable comments on this book at various stages of its life: Asher Sinensky, Kyle Erickson, Brendan Cooney (and the legal ninjas), Andy Oram, Nathan Good, and Seth Schoen.
Special thanks to the rest of the Privacy & Civil Liberties Engineering team at Palantir for acting as our research arm and teachers on all things privacy. Special thanks to the distinguished members of the Palantir Council of Advisors on Privacy and Civil Liberties (PCAP) for providing encouragement and fodder for this effort through so many enlightening discussions. Special thanks to the engineering teams at Palantir, for spending years imagining, building, and perfecting many of the architectures that we describe in this book. In a very real way, the authors are just the messengers, the documenters of the hard technical work that goes into creating these systems.
Special thanks to Mike Loukides at O’Reilly for being as excited about this book as we were and helping us to make it happen. Inspiration for this book came out of meetings at O’Reilly’s Foo Camp, specifically a session run by Brian Fitzpatrick and Harper Reed on Internet privacy in the post-Snowden era.
To my co-authors, who enrich my working life with their erudition and passion for the content of this text, and whose company and good humor were the epidural to this protracted labor of love. To Kyle Erickson and especially Elissa Lerner for knowing when humor, gentle prodding, good-natured public shaming, and other more medieval editorial machinations were needed to prod me along; I quite literally would not have done it without your indefatigable encouragement. And, most of all, to Sarah, whose support, understanding, kindness, and unwavering affection remind me daily that the sacred spaces we aim to preserve and protect through privacy engineering matter most when the personhood cultivated therein can ultimately also be shared.
To my wife, for indulging yet another professional distraction that draws me out of our happy home and for taking care of the bedtimes that I’ll miss while I’m out playing author. To my children: you are the inspiration for my wanting to make the world a better place to live in. Consider this a small part of my efforts to build a world that is safe for you to live in. To Chris Dibona and Tim O’Reilly for your encouragement, sometimes intentional and sometimes incidental. And to my parents: for never being impressed enough with my work to let me feel satisfied. You keep me moving. And finally, to my co-authors for tolerating my rewrites and doing the bulk of the work in writing this book.
To Mike van Opstal, the engineer in my life. To my family and friends, who have to listen to me rant and rave on privacy and civil liberties on a daily basis and who hoped that this effort might purge my soul (no such luck). To Dr. Karp and Palantir for believing that privacy engineering could be a real job and letting me turn my passion into my life. To those who risk their lives every day to protect the ideals of a free society and labor to bring freedom to all those who want for it.
To my teachers, who shared with me the love of a good sentence. To my friends, Elissa Lerner and Kyle Erickson, from whom I have learned so much, and who, with their erudition, refined linguistic sensitivity, and dauntless persistence for excellence, made this a far better book than it would otherwise have been; they are as much authors of this book as any of us, regardless of what ended up on the cover page, and all should know this. To John, Ari, and Courtney, for their wisdom and good humor along the path of shared suffering. To my family and those who came before them who left the lands of tyranny to build lives of ordered creativity in a new, free land, and without whom none of this would have been possible.