Chapter 1. What Is Privacy?

Privacy incursions in the name of progress, innovation, and ordered liberty jeopardize the continuing vitality of the political and intellectual culture that we say we value.

Julie E. Cohen, professor of law, Georgetown University

Privacy is important. These three words form the philosophical compass for this book and summarize (albeit inelegantly) the eloquent description above of the consequences of ignoring privacy. For us, privacy serves not only as a bulwark against threats to individual liberty and to society as we know it, but also as a cornerstone of a thriving, innovative economy.

There have long been, and continue to be, roiling societal debates on the topic of privacy. Every reader of this book will come with their own conception of why privacy is important. What do you see as a threat to privacy? How significant are those threats? And most importantly, what role will your technology play in shaping the world in which those threats exist? If you are reading this book, then you have probably asked yourself these questions and, in your own way, reached the same conclusion we have: privacy is important.

Proceeding from that premise, we then assert that engineers can and should take privacy into account when designing and building technology. There is a long history of interaction between policy and technology that demonstrates just how important a role engineers can play. Thinking carefully about the architecture of privacy will show that it is possible to build systems that make it substantially easier to protect privacy and much more difficult to violate it, intentionally or otherwise. This book will help you do that.

How to Think About Privacy

In order to build technology that can help protect privacy, we must first understand privacy and how it is shaped by law, policy, and technology. Though we often take its meaning for granted, privacy is neither a simple concept nor one that everyone defines the same way. Privacy can encompass a broad swath of sometimes interrelated and often overlapping ideas. It is also a moving target—the concept changes and adapts over time.

In this section, we define privacy for the purposes of this book. We also examine how technology has interacted with legal and policy development (and vice versa) to shape the concept of privacy. This is not meant to be a comprehensive history of privacy, but rather to provide some context to this complex interaction that will help you understand the broader environment in which your technology must operate.

Defining Privacy

A single definition of the word “privacy” has been historically difficult to pin down. Definitions of privacy have always been reflections of contemporary contexts, resulting, perhaps unsurprisingly, in what legal scholar Daniel Solove describes as a “concept in disarray.”1 Consequently, this concept can plausibly encompass no less than the “freedom of thought, control over one’s body, solitude in one’s home, control over personal information, freedom from surveillance, protection of one’s reputation, and protection from searches and interrogations.”2

Even documents regarded as essential bulwarks against encroachment on individual privacy turn out to be surprisingly vague on the topic. The United States Constitution, for instance, does not contain the actual word “privacy.”3 Other documents do not eschew the word, but they do not offer much help in defining it. The Universal Declaration of Human Rights, a component of the United Nations-created International Bill of Rights, asserts in Article 12 that “No one shall be subjected to arbitrary interference with his privacy.” The European Convention on Human Rights, for its part, was only able to muster a “right to respect [the individual’s] private and family life, his home and his correspondence.” Consequently, it has been left to legislatures, courts, advocates, and academics to actually flesh out the elusive meaning behind these seven letters.

Broadly speaking, experts sort conceptions of privacy into two categories—informational privacy, which concerns “the collection, use, and disclosure of personal information,” and decisional privacy, which relates to “the freedom to make decisions about one’s body and family.”4 Given that this book focuses on technologies that work with data, we will concentrate our discussion on informational privacy. However, this should not be read to suggest that the adoption of these capabilities will only affect informational privacy. If information truly is power (as is frequently asserted), then the ability to control information about oneself can have a direct effect on the freedom one has to think and act independently. Thus, informational privacy cannot be wholly divorced from decisional privacy, and addressing one necessarily implicates the other.

A Short History of U.S. Informational Privacy

The concept of informational privacy has evolved over time. Tracing its history shows that the development of technology and privacy law and policy are closely intertwined. Changes in one area can have significant effects on the other. A historical review also illustrates how, in many cases, the same core issues we face today are merely the latest permutation of long-standing challenges. Understanding how informational privacy has developed in one jurisdiction will not only help us understand its current state but also its potential evolution in the future.5

More than 120 years before the seeming omnipresence of information-sharing platforms like Facebook, Instagram, Snapchat, and their kin, there was Kodak. In 1888, the Kodak camera was introduced to the American public, allowing anyone to capture and share moments of people’s lives like never before. Concerns about the privacy implications of the new technology quickly followed. “Beware the Kodak,” lamented The Hartford Courant. “The sedate citizen can’t indulge in any hilariousness without incurring the risk of being caught in the act and having his photograph passed around among his Sunday School children.”6

The legal community soon took notice. In 1890, Samuel Warren, a prominent Boston attorney, and Louis Brandeis, later to serve as a Supreme Court justice, published “The Right to Privacy” in the Harvard Law Review, an article widely considered one of the most influential in the American legal canon and still cited in court opinions to this day. The article began by briefly charting the development of the concept of privacy up to that point before declaring, “Recent inventions and business methods call attention to the next step which must be taken for the protection of the person…. Instantaneous photographs and newspaper enterprise have invaded the sacred precincts of private and domestic life….”7 After outlining the perceived harms of these intrusions, Warren and Brandeis looked to the existing common law (i.e., law developed over time by judges as they decide cases) for the foundations of a “right to be let alone.”8 Notably, they also delineated limits to this right, suggesting that at some point “the dignity and convenience of the individual must yield to the demands of public welfare or of private justice.”9

Thirty-eight years later, Brandeis went on to erect yet another pillar in privacy history with what became one of the most frequently cited dissenting opinions in U.S. Supreme Court history. In 1928, in Olmstead v. United States, the Court determined in a 5–4 decision that federal agents could wiretap a phone without obtaining judicial approval. In a fiery dissent, Brandeis reaffirmed the importance of “the right to be let alone—the most comprehensive of rights and the right most valued by civilized men.” Brandeis chided the majority that “time works changes, brings into existence new conditions and purposes” and therefore the Court must be prepared to apply constitutional protections to situations not envisioned by the Framers, which in this case meant applying the Fourth Amendment protections against unreasonable search and seizure beyond the “sanctities of a man’s home.” He warned that technology would continue to challenge the Court’s conception of privacy protection. “The progress of science in furnishing the Government with means of espionage is not likely to stop with wiretapping,” he wrote. “Ways may someday be developed by which the Government, without removing papers from secret drawers, can reproduce them in court, and by which it will be enabled to expose to a jury the most intimate occurrences of the home.”

However, it would be nearly four decades before the Supreme Court would recognize Brandeis’ prescience, and it would be an invention that had at that point already existed for more than 90 years, the telephone, which would inspire the Supreme Court to a new era of privacy protection. Noting that it could not “ignore the vital role that the public telephone has come to play in private communication,” the Supreme Court, in Katz v. United States (1967), declined to follow the Olmstead majority’s view that the Fourth Amendment be narrowly construed to apply only to the home, finding instead that it “protects people, not places.”10 It therefore could protect activity conducted in areas accessible to the public from government surveillance (in this case, requiring a warrant for wiretapping a public telephone booth). The near-ubiquitous adoption of technology over the better part of a century had at last dragged the law forward.

By the 1960s, the growth of the post-New Deal government combined with the post-war economic and population boom resulted in an explosion in the number of records kept about people in both the government and private sector.11 Computerized record-keeping, which had begun as early as the 1890 U.S. Census using Herman Hollerith’s mechanical tabulator, was not just convenient—it was becoming essential as a means of managing an ever-increasing volume of data.

As data proliferated, academics and activists became increasingly concerned with how this data was being managed—particularly since much of it was then in the hands of the U.S. government and there was little transparency as to how it was being used. In 1972, the Secretary of the Department of Health, Education, and Welfare (HEW)12 established the Secretary’s Advisory Committee on Automated Personal Data Systems. It was formed to address “growing concern about the harmful consequences that may result from uncontrolled application of computer and telecommunications technology to the collection, storage, and use of data about individual citizens.”13 In assembling the Committee, Secretary Elliot Lee Richardson specifically cited technological innovation as the driver of this reassessment of privacy. “The use of automated data systems containing information about individuals is growing in both the public and private sectors…,” he wrote. “The Department itself uses many such systems…. At the same time, there is a growing concern that automated personal data systems present a serious potential for harmful consequences, including infringement of basic liberties.”

In response, the Committee produced “Records, Computers and the Rights of Citizens,” a (perhaps surprisingly) helpful government report that continues to influence privacy policy to this day. The report was submitted on June 25, 1973, the same day that John Dean, former White House Counsel, testified before the Senate Watergate Committee that President Richard Nixon was involved in the cover-up of the Watergate burglary. Allegations of governmental abuse of power pervaded the zeitgeist when the HEW Committee concluded that, “Under current law, a person’s privacy is poorly protected against arbitrary or abusive record-keeping practices.”

The Committee went on to propose a set of principles that should apply to the construction and use of automated personal data systems. These principles would eventually come to be known as the Fair Information Practice Principles (FIPPs), and they have been adopted around the world as the basic framework of information-privacy legislation and policy.

The FIPPs have been formulated in a variety of ways, and carry significant weight in the operational and technical frameworks of privacy. They are summarized in the following sidebar.

These principles are now enshrined in such diverse places as the U.S. Privacy Act of 1974, the European Union Data Protection Directive, the Australian Privacy Act’s Information Privacy Principles, the Singaporean Personal Data Protection Bill, India’s Information Technology Rules (formally, The Reasonable Security Practices and Procedures and Sensitive Personal Data or Information Rules), and a number of other national laws and policies that make up today’s privacy landscape.14

Today

The HEW report and the legislation that flowed from it over the course of two decades represent arguably the last major watershed moment in informational privacy law development. Since then, the legal infrastructure has been strained to the breaking point as policymakers and judges struggle to apply decades-old law to technology that was barely imaginable when those laws were passed.

The U.S. Privacy Act, for one, has not been substantially amended since its initial enactment in 1974, forcing innovators in data processing technology to figure out how to fit sophisticated new data structures into the filing cabinet-record paradigm that characterizes the Act. The Electronic Communications Privacy Act (ECPA), meanwhile, which governs U.S. federal law enforcement’s use of wiretaps, pen registers, trap-and-trace devices, and the interception of electronic communications such as email, was enacted in 1986—long before most Americans had even heard of the Internet, let alone adopted it as one of their primary modes of communication and commerce. Consequently, ECPA has created a set of confusing, inconsistently applied standards, yielding strange results.15

U.S. state privacy laws have fared somewhat better, with states creating context-specific privacy requirements for an assortment of data types (e.g., bank records, insurance, educational information).16 However, each state takes a different approach to privacy. When state laws conflict with federal laws, legislatures and courts are forced to engage in complex legal analysis to determine which system should take precedence. This often leads to confusing outcomes. Such a hodge-podge of privacy rules often leaves multistate and multinational businesses scrambling for strategies to build one product or adopt one policy that meets the requirements of every state.

Meanwhile, the European Union’s sometimes aggressive enforcement of assorted Member State “data protection” laws has led to stronger global privacy practices as multinational companies hoping to operate in Europe attempt to comply.17 Yet even these laws are built on the foundation of the FIPPs, and are cracking under the strain of the new paradigm of contemporary data scale and complex analytics. The European Union has proposed an update to its data protection regime, which is discussed in more depth in Chapter 11.

Outside of legislatures, the courts have fared little better in trying to keep pace with technological development. In one of the more significant privacy decisions of the last twenty years, Kyllo v. United States (2001), the Supreme Court ruled that police would be required by the Fourth Amendment to obtain a search warrant in order to direct a thermal imaging device at a private residence. Acknowledging that “[i]t would be foolish to contend that the degree of privacy secured to citizens by the Fourth Amendment has been entirely unaffected by the advance of technology,” the Court concluded that an unreasonable search had occurred because “here, the Government uses a device that is not in general public use, to explore details of the home that would previously have been unknowable without physical intrusion.”

The Court’s use of “not in general public use” could be read to suggest that the Court “deliberately adopted a rule that allows the outcome to change along with society,” thereby trying to create a standard of privacy protection that adapts with the growth of technology.18 But since there have been few follow-up cases along this line, it is hard to determine if the Court’s rule was actually successful or if it just created more confusion without adding any real protection against intrusions on personal, informational privacy.

Lastly, the United States Federal Trade Commission (FTC) has taken on a lead role in protecting consumer privacy. It is worth noting that it has done so not under the auspices of any of the aforementioned privacy laws but rather pursuant to its authority under Section 5 of the Federal Trade Commission Act (15 USC 45), which prohibits “unfair or deceptive acts or practices in or affecting commerce.” The FTC has used this authority to bring legal action against organizations that it argues have deceived consumers by failing to live up to their promises to handle consumers’ personal information in a secure way.

The FTC has developed such a reputation that some scholars have claimed that “today FTC privacy jurisprudence is the broadest and most influential regulating force on information privacy in the United States—more so than nearly any privacy statute or common law tort.”19 However, the overall effectiveness of these actions in providing greater consumer privacy protection must be measured in light of the fact that enforcement depends primarily on the FTC policing organizations’ assertions about their own behavior. This means the level of privacy protection is driven not by government regulation itself but by organizations’ decisions about the level of privacy protection they’d like to provide their own customers.

While law lurches along haphazardly, technology continues to leap forward. In 2013, the Pew Research Center’s Internet & American Life Project reported that more than 90% of Americans owned cellular telephones, and some suggested that the adoption of the smartphone was outpacing the spread of any other technology in human history. The massive amount of transactional and geolocational data generated by these mobile devices contributes to the larger trend of an exponential growth in the amount of stored data in the world, which by one estimate reached around 1,200 exabytes in 2013.20 Attempting to describe the state of the “big data” world in 2013, economists Viktor Mayer-Schönberger and Kenneth Cukier coined the term “datafication” to refer to “taking information about all things under the sun—including ones we never used to think of as information at all, such as a person’s location, the vibrations of an engine, or the stress on a bridge—and transforming it into a data format to make it quantified [allowing] us to use the information in new ways.” (See Chapter 11 for more on datafication.) An entire industry of big data analytics has emerged to take advantage of these mountains of information, often developing techniques that can extract unexpected insights (sometimes relating to deeply personal subjects) from seemingly innocuous data.

These vast reservoirs of data—in particular, personal data about individual behavior—have not only been a boon to the commercial sector, they have also provided a treasure trove of information for governments. Police departments, intelligence services, and government agencies of all kinds have harnessed the power of data analytics to do everything from eliminating inefficiencies in housing-code violation investigations to anticipating crime outbreaks to capturing terrorists. Privacy and civil liberties advocates have long expressed concern at the extent to which some of this information is being collected and used by governments, but for the most part they could only speculate as to what was happening behind the veil of secrecy shrouding the clandestine services.

This all changed on June 5, 2013, when The Guardian revealed the bulk collection of telephony data by the U.S. National Security Agency on a scale that shocked many observers.21 Four days after breaking the news, The Guardian introduced the world to Edward Snowden, a former NSA contractor who executed one of the largest intelligence leaks in U.S. history in order to reveal “the federation of secret law, unequal pardon and irresistible executive powers that rule the world.”22 The ongoing release of classified materials has triggered one of the largest public discussions about privacy, and one of the most significant reviews of U.S. intelligence activity, since the Church Committee investigated CIA and FBI domestic abuses in the 1970s.

Once more, the law is scrambling to catch up with new technological developments. A declassified opinion of the U.S. Foreign Intelligence Surveillance Court (the FISA Court), the body charged with judicial oversight of certain intelligence community activities, acknowledged as much when it found that Fourth Amendment protections did not apply to the collection of “non-content telephony metadata.” It also suggested that this conclusion (which relied on a 1979 Supreme Court decision) would do well to be revisited by the Supreme Court “in the context of twenty-first century communications technology.” Other courts have reached similar conclusions, and a robust debate over these issues continues in courtrooms, classrooms, and legislative hearing rooms around the world. While it remains unclear how these issues will be resolved in the coming years, it is clear that technological development will continue to be one of the driving forces in shaping an individual’s privacy rights.

“East Coast” Code and “West Coast” Code

Technologists may think themselves helpless in the face of legal developments, resigned to waiting for society to react to a new technology and adapt law and policy to the new technological paradigm. In reality, technologists may have as much influence on the development of the law as the law does on technology. Consequently, the technology described in this book should not be thought of as just a reaction to the requirements of law but also as a potential means of shaping the ultimate legal outcomes.

As history illustrates, the interaction of privacy law and technological innovation can seem like billiard balls on a table. Often they appear to be largely separate worlds that occasionally collide, sending one or both careening off in a new direction, each one affecting the other in different ways but never merging. Inventors and engineers solder wires and write computer code, but their understanding of the law tends to be limited to the rules defining what they can and cannot do. Lawyers and policymakers, meanwhile, only become aware of new technology when it reaches a critical mass of usage in popular society, and they often spend years trying to understand how this new technology changes the world around them and then deciding how the law should (or should not) react to those changes.

However, we believe law and technology cannot and should not operate in separate worlds. Ideally they should work together, with technologists understanding and designing technology based on a solid grasp of relevant law and policy, and lawyers and policymakers understanding technological capabilities in order to better inform and even support their policy decisions. This concept derives from Harvard law professor Lawrence Lessig, who, in 1999, sought to explain the state of regulation in the nascent world of cyberspace:

“The single most significant change in the politics of cyberspace is the coming of age of this simple idea: The code is law. The architectures of cyberspace are as important as the law in defining and defeating the liberties of the Net. Activists concerned with defending liberty, privacy or access must watch the code coming from the Valley—call it West Coast Code—as much as the code coming from Congress—call it East Coast Code.”

Lessig later clarified further: “The lesson of code is law is not the lesson that we should be regulating code, the lesson of code is law is to find the right mix between these modalities of regulation to achieve whatever regulatory objective a government might be seeking.”

The so-called “West Coast” code and “East Coast” code can interact in a variety of ways.23 In some cases, “West Coast” code defines the physics of the world in which “East Coast” code can operate. The very design of devices and the networks that support them establishes the boundaries of the environment within which policymakers can operate. For example, the creation of biometric authentication technology allows policymakers to require the use of such capabilities to secure sensitive systems. In other cases, “East Coast” code directly limits what “West Coast” code can do. For example, cybercrime laws prohibit the creation of malicious code. It is the complex spectrum between these two extremes that generates the sizeable range of options available to the thoughtful, privacy-minded software engineer.

Consider the development of cellular phone capabilities. Back in ancient times, cell phones were relatively simple devices used to connect two people for a voice conversation. Today, they can contain (and generate) substantial amounts of information touching almost every aspect of our lives. Cell phones can now store gigabytes of information in the form of documents, pictures, videos, and other types of files. They can also run various applications that allow them to access other troves of information such as server-based email accounts.

While useful and driven by consumer desire for such access, the storage of this data has led to some challenging new issues under U.S. Fourth Amendment “search and seizure” law, and the development of certain cell phone capabilities can have a profound effect on personal privacy and fundamental freedoms. The Fourth Amendment to the U.S. Constitution prohibits agents of the government from conducting “unreasonable searches and seizures” of “persons, houses, papers, and effects” without a judicially issued warrant based on a finding that there is “probable cause” to believe that evidence of a crime or contraband will be found. There are several judicially created exceptions to this stricture, including one that has been interpreted to allow law enforcement officers to seize cell phones as part of a search incident to arrest and review the contents of those phones without obtaining a search warrant.

For a long time, courts were split over the validity of these searches. Some suggested that because the phone is on the arrestee’s person and may contain evidence, seizing the phone constitutes little more than reading the contents of a piece of paper found in the arrestee’s pockets. Others argued that the sheer volume of information available on the device changes the analysis, as law enforcement officers would normally only be able to obtain such extensive information via warrants that authorize the search of a computer hard drive or subpoenas requesting access to stored emails from a third-party email provider. In 2014, the Supreme Court, in Riley v. California, finally settled this question, finding a substantial distinction between the contents of one’s pockets and the contents of one’s cell phone:

“Modern cell phones, as a category, implicate privacy concerns far beyond those implicated by the search of a cigarette pack, a wallet, or a purse. A conclusion that inspecting the contents of an arrestee’s pockets works no substantial additional intrusion on privacy beyond the arrest itself may make sense as applied to physical items, but any extension of that reasoning to digital data has to rest on its own bottom.”

Thus, we can see that “West Coast” decisions to create devices with substantial storage capacity have required “East Coast” counterparts to reconsider long-standing legal doctrines.

Another thorny issue surrounds geolocational data generated by cellular phones. Geolocational information can be generated any time a phone call is made, any time a text is sent, any time an application relies on geolocation data (e.g., an application providing information on vehicle traffic), and even any time a device passively “pings” a cellular tower as it moves in and out of coverage areas. This information can be stored on the phone, with the cellular provider, and with the maker of the application, thus creating a potentially enormously valuable data source for law enforcement and intelligence agencies.

But this body of information exists because of “West Coast” decisions to design systems that generate and store it. “East Coast” law enforcement and intelligence policymakers then responded to the creation of this entirely new set of data by integrating it into their investigatory techniques. Ultimately, the public, courts, and policymakers are left to debate and decide if this is an appropriate use of the data, and whether there should be legal restrictions on the use of this information both by the public and private sectors. In this case, the “West Coast” code created an entirely new source of information that fundamentally changed the relationship of the individual users to their devices (we now essentially carry tracking devices in our pockets) and it was done in a relative “East Coast” code vacuum, thereby creating a great deal of uncertainty regarding the power of the government and others to track our every move.

These are just two examples that serve to illustrate the complexity of the technological and legal landscape, and in many ways, even these cases are overly simplified. “West Coast” and “East Coast” are hardly monoliths defined by a single motivation or goal. Instead, they are both composed of constantly shifting coalitions of interests, including individual coders motivated sometimes by profit and sometimes by altruism; businesses with substantial economic stakes in both legal and technical outcomes; policymakers torn between protecting privacy, preventing crime and threats to national security, and promoting economic growth in the tech sector; advocacy organizations looking to foster a free and independent cyber world while at the same time trying to curb the potential for nefarious exploitation of this world; and individual consumers eager to take advantage of useful and fun new technologies while anxiously trying to preserve a seemingly dwindling sphere of private life. Each of these interests can, and often does, shift between looking to “West Coast” code and “East Coast” code to address any given concern.

Why Privacy Is Important

The historical influence of technology on privacy law raises the question—did technological innovators have privacy in mind when they designed their products? When George Eastman introduced the Kodak camera, how much thought did he give to its ultimate effect on individual privacy? Did he imagine a world of candid, snapshot photography and wonder how it would affect, for better or for worse, the photographer and the photographed? Did he hesitate for a moment before pulling away the cloth to unveil his invention? Did he consider ways to modify the technology to better protect privacy?

Perhaps the better question to consider is why Eastman, or any technological innovator, would even want to consider these questions in the first place. In today’s society, at least, there are a number of potentially significant consequences—both practical and ethical—for businesses that fail to consider the privacy implications of their work.

On the practical side, innovators today face a complex web of privacy law at the state, federal, and international levels. Failure to comply with these laws can open the door to sizeable civil lawsuits, or substantial government fines. Here are just a few recent examples:

  • In 2011, Facebook settled a class action lawsuit for $20 million for using the names and pictures of members in “Sponsored Stories” without their consent. Facebook has also agreed to aggressive oversight from the U.S. FTC that could lead to further fines if the company is found to share user information without proper notice and consent.

  • Google settled with the FTC in 2012 for $22.5 million for bypassing the privacy settings of the Safari mobile browser. In addition, Google has been fined by a number of European data-protection authorities (and is under investigation by several others) for violation of privacy laws.

  • Smaller businesses are not immune. In 2013, the makers of a social networking application called Path were fined $800,000 by the FTC for collecting personal information from children without parental consent.

  • A four-employee smartphone application developer called W3 Innovations agreed to a $50,000 fine paid to the FTC for similar violations involving the collection and sharing of data from children.

Steep fines like these create incentives to build or buy products that can facilitate the privacy-protective practices demanded by regulators. But aside from financial penalties, companies might also be in the market for such products to help proactively assuage the concerns of a privacy-sensitive customer base. Any customer with sensitive data will likely prefer a product or a service provider that can keep their information safe from theft or misuse, and otherwise handle data appropriately. Innovators could also favor privacy-protective products to circumvent any bad publicity that might doom a new product before it ever has a chance to flourish.

Government organizations, and the businesses selling to them, will face similar pressures. Statutes, regulations, and policy can all require the implementation of complex data-handling procedures. Meanwhile, public opinion can sometimes demand the implementation of privacy-protective measures before data-driven programs can win broad support. The product designers who anticipate these considerations as they build their offerings will often have a business advantage over those who have not incorporated privacy-protective technologies into their core design.

Another practical consideration is the need to hire the best talent. Most companies will only be as good as their engineering talent, and many of those engineers will want to be challenged by their work. Engineers want to work at companies at the vanguard of their respective fields, and innovative data privacy solutions are part of what is considered the cutting edge—this alone may prove attractive.

But there is the ethical component to consider as well. Engineers working for a company that is regularly implicated in privacy violations, or that sells its product to companies or countries that might misuse that technology, may face not only the pricking of their own conscience but also the disapproval of their fellow engineers. This latter point should not be taken lightly. In robust online communities in which many engineers play an active part, reputation is paramount. A company that dedicates itself to doing business in a way that enhances privacy protection at best, and at the very least does no harm to individual privacy, may have an easier time appealing to engineering talent.

Finally, technologists may wish to take steps to protect their users’ privacy if for no other reasons than (1) to acknowledge and respect the trust their customers place in them, and (2) to recognize that they, too, must live in the world their products will shape and will face the same harms from inadequate privacy protections as their fellow citizens. Engineers should not divest themselves of responsibility for the societal consequences of the technology they create.

While there may be no absolute “right” answer in terms of how much privacy each of us should have and how that privacy should be preserved, we argue that it is unacceptable for engineers to take an agnostic view—either by choosing to ignore the effects of their technological designs or by simply remaining ill-informed as to the potential political, economic, and social effects of their products. Given their power as agents of change (a subject whose surface is merely scratched by this chapter), engineers have a responsibility to the rest of society.

In a liberal democratic society, social accountability with regard to privacy must be a part of technological development. Technologists must do their best to protect privacy—by maintaining familiarity with important policy decisions and ongoing court cases, learning to use the latest tools available, or building new ones themselves. These concerns are not just academic. Ignoring them can have devastating costs to business and society, and implementing them can yield enormous practical rewards.

Before You Get Started

Since Warren and Brandeis’ first, relatively short law review article appeared in the Harvard Law Review over a century ago, countless volumes have been written on the right to privacy. Dozens of privacy conferences convene around the world every year, each devoted to trying to understand this elusive right and how to best preserve it (see “Selected Privacy Conferences” in Chapter 12). This chapter can therefore hardly do justice to this ever-growing trove of privacy scholarship, but we hope it at least provides a high-level understanding of how technology and privacy interact and the important role technologists play in that nexus.

With some background on these issues now in mind, you can start thinking about how you might determine which privacy capabilities to use, and when. A series of basic questions about the technology you are trying to build will help get you started:

Does this technology interact with personally identified or identifiable information?

Define your data sets. If they contain personally identified or identifiable (PII) data, then you need to dig deeper into whether or not privacy-protective features should be incorporated into the design. As we’ll see in Chapter 2, PII is readily defined, but determining whether information is identifiable requires deeper analysis. Remember, users of a system are not operating in a vacuum—they exist in a world of data. Just because the data used by your product is not identifiable in itself does not mean users cannot still match that data with other data from outside the system, thereby rendering the data identifiable.24
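To make this first question concrete, here is a minimal sketch in Python of an initial screen over a data set's column names. The field lists and column names are hypothetical assumptions, and no name-matching heuristic can substitute for a real identifiability analysis; it only gives you a starting inventory to dig into.

```python
# A minimal sketch of an initial PII screen; the identifier lists and the
# example column names below are illustrative assumptions, not a standard.

DIRECT_IDENTIFIERS = {"name", "email", "ssn", "phone", "account_id"}
QUASI_IDENTIFIERS = {"zip", "birth", "gender", "device_id", "ip_address"}

def classify_fields(columns):
    """Sort a data set's column names into rough identifiability buckets."""
    report = {"direct": [], "quasi": [], "other": []}
    for col in columns:
        key = col.strip().lower()
        if any(token in key for token in DIRECT_IDENTIFIERS):
            report["direct"].append(col)
        elif any(token in key for token in QUASI_IDENTIFIERS):
            report["quasi"].append(col)
        else:
            report["other"].append(col)
    return report

if __name__ == "__main__":
    print(classify_fields(["user_name", "zip_code", "birth_date", "purchase_total"]))
    # Even "other" fields can become identifying when joined with outside data,
    # so a clean report here does not make the data set anonymous.
```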

What is the technology supposed to do with the data?

As the product designer, you will of course already have this in mind. But since you are defining the parameters of your privacy analysis, it’s important to remember that you are building something that has a primary goal beyond—and most likely totally unrelated to—privacy. Just about anything that uses PII involves some conscious decision by users to provide personal information (read: give up some privacy) in order to receive some utility from the product.25 Consequently, your concern about privacy should not be so absolute as to undermine this transaction by not providing the full utility expected by your end user. When starting design, it will be prudent to think through the tradeoffs between privacy and other benefits, weighing where you should set the dividing line and what the defaults should be.
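One way to keep that dividing line explicit is to treat privacy-protective defaults as reviewable configuration rather than incidental code. The sketch below, in Python and with entirely hypothetical setting names, layers a user's deliberate choices over such defaults.

```python
from dataclasses import dataclass

@dataclass
class PrivacyDefaults:
    """Hypothetical default settings; every name here is illustrative."""
    collect_location: bool = False       # location is opt-in, not opt-out
    retention_days: int = 30             # keep data only as long as the feature needs
    share_with_partners: bool = False    # no third-party sharing unless enabled
    analytics_granularity: str = "aggregate"  # "aggregate" or "per_user"

def effective_settings(user_choices: dict) -> PrivacyDefaults:
    """Layer a user's explicit choices over privacy-protective defaults."""
    settings = PrivacyDefaults()
    for name, value in user_choices.items():
        if hasattr(settings, name):
            setattr(settings, name, value)
    return settings

# A user who wants traffic features can turn location on explicitly;
# everyone else keeps the more protective default.
print(effective_settings({"collect_location": True}))
```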

What could the technology do with the data?

Once your product ships or is downloaded or otherwise gets into the hands of your customers, you lose some degree of control over it. They are going to use it for what it was designed to do, but they are not necessarily going to be constrained by the parameters of the product documentation. You have to consider these potential other uses and make sure you control for them as much as possible. Could the data that is collected or used by the capability be used to reveal sensitive information that users had no intention of exposing? Could different types of data be uploaded into the application and used in a privacy-threatening way? Never underestimate the creativity (and the tenacity) of talented technical people—if there’s an unconventional way to use your product, someone will find it. Try to think two moves ahead of them.

What are the potential privacy concerns?

Create a three-column chart. In the first column, list the potential functions of your product (both intended and unintended). In the second column, list all the potential privacy concerns raised by each function. In some cases, these concerns will track to particular laws or policies (e.g., European Union data protection laws). In other cases, the concerns will reflect other interests, such as your own organizational values, or the knowledge that consumers will respond negatively to certain consequences. Never discount your own instincts as to whether an outcome feels “creepy,” even if you can find no legal or other imperative that prohibits a particular usage of the product.

How can you configure your privacy building blocks to address those issues?

In the third column, find a privacy mitigation strategy for each privacy concern. As you think through this part of the framework, do not start with the technical solution; it’s almost impossible to design privacy protections that function entirely independently of human control. Instead, technical capabilities must support human-managed policy that is designed to protect privacy. Imagine the individual user or corporate or government privacy officer trying to use your product. How would they want to protect their privacy interests? What tools would they need in order to effectively manage their data and address privacy concerns? Are they more likely to want to establish rigid preventive measures to ensure data is never used in certain ways or are they more likely to use oversight mechanisms that discourage data misuse by ensuring accountability? Then fill in the third column with the technical building blocks that will enable this policy outcome.
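If it helps to keep the chart usable as your product evolves, the sketch below captures it as a small Python data structure; the example rows and entries are made up for illustration and are not an exhaustive analysis.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class AnalysisRow:
    """One row of the chart: a product function, its concerns, its mitigations."""
    function: str
    concerns: List[str] = field(default_factory=list)
    mitigations: List[str] = field(default_factory=list)

# Illustrative rows only.
chart = [
    AnalysisRow(
        function="Show nearby friends on a map",
        concerns=["Location history can reveal home, work, and habits",
                  "EU data protection rules on location data"],
        mitigations=["Collect coarse location, on an opt-in basis only",
                     "Delete location history after a short retention period",
                     "Log and audit internal access to location data"],
    ),
    AnalysisRow(
        function="Bulk export of user data for analytics (unintended use)",
        concerns=["Re-identification when joined with outside data sets"],
        mitigations=["Aggregate before export", "Restrict who may run exports"],
    ),
]

# Rows with concerns but no mitigations are the gaps to close before shipping.
for row in chart:
    if row.concerns and not row.mitigations:
        print("Unmitigated function:", row.function)
```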

Answering these questions should provide a basic framework for how your technology might interact with the rest of the world. These questions will also help you figure out who needs to see what data and when they need to see it as information is processed. This information will help you begin to sketch out the basic architectural framework upon which you can hang your privacy-enhancing features.

But don’t start building just yet. Next, start considering questions regarding the potential privacy implications of your technology:

  • Is it creating or storing new types of data that might expose new facets of an individual’s life?

  • Would exposure of this information cause embarrassment, lead to stigmatization of or discrimination against the individual, or even just inconvenience or annoyance?

  • Does the creation and/or use of this data change the balance of power between individuals, businesses, or governments?

  • Does this data fundamentally change how these interests interact in a way that creates advantages and/or disadvantages for any of them?

In addition to these basic questions, don’t dismiss your gut instinct as to whether the use of a new capability might be perceived as “creepy”—a standard that is largely undefinable yet often instantly recognizable.

If the answer to these questions is “yes” (or if you have that creepy feeling in your gut), then you need to make two fundamental decisions:

  • Should I build this? Do I believe the benefits of this technology outweigh the potential privacy risks it creates?

  • How do I build this in a way that mitigates those risks?

The first question is one you will need to determine yourself. The second is one we hope you can answer with the help of this book.

1 Solove, Daniel J. Understanding Privacy. Cambridge, Mass.: Harvard University Press, 2008.

2 Ibid.

3 In the United States, it was not until the 1960s, in Griswold v. Connecticut, that the U.S. Supreme Court identified a constitutional “right to privacy,” opining that “specific guarantees in the Bill of Rights have penumbras, formed by emanations from those guarantees that help give them life and substance,” and from those penumbras and emanations emerge “zones of privacy.”

4 Solove, Daniel J., and Marc Rotenberg. Information Privacy Law. New York: Aspen Publishers, 2003 (emphasis ours).

5 Informational privacy law has also developed in Europe and around the world in different ways. A comprehensive history of the concept would, of course, have to consider a broader global lens. However, this section is not meant to be a comprehensive history of the concept—merely one case study of the interaction of the technical and policy/legal worlds.

6 Brayer, Elizabeth. George Eastman: A Biography. Baltimore: Johns Hopkins University Press, 1996. 71.

7 Warren, Samuel D., and Louis D. Brandeis. “The Right to Privacy.” Harvard Law Review, 1890.

8 Ibid.

9 Ibid.

10 Katz established the “reasonable expectation of privacy” test to determine when the Fourth Amendment’s protections apply. In applying the test, a court must consider whether a person subjectively believed that a location or situation is private, and then it must determine whether this belief would be generally recognized by society. This test remains a cornerstone of Fourth Amendment jurisprudence, and a full discussion of its strengths and weaknesses is beyond the scope of this book. Suffice it to say that its application has become increasingly complicated as the amount of information in the world—and the means of storing and sharing it—has multiplied at an astounding rate.

11 “The number of bank checks written, the number of college students, and the number of pieces of mail all nearly doubled; the number of income-tax returns quadrupled; and the number of Social Security payments increased by a factor of more than 35.” HEW Report, Chapter I, “Records and Record Keepers.”

12 HEW later split into the Department of Health and Human Services and the Department of Education in 1979.

13 U.S. Dept. of Health, Education and Welfare, Secretary’s Advisory Committee on Automated Personal Data Systems, Records, Computers, and the Rights of Citizens, 1973. Preface.

14 It is important to note, however, that the FIPPs themselves are not legally binding, and the specifics of their incorporation into law and policy can vary from country to country and context to context. It’s best to think of the FIPPs as representing general themes of privacy law, while still looking to specific law and policy to understand the actual legal requirements and limits for data collection and use in a particular location and industry.

15 For example, a single email is subject to multiple legal standards. Law enforcement has different procedures for getting access to an email depending on whether it is a) intercepted in transit, b) accessed from a server before it is 180 days old, or c) accessed after it has been on the server for 180 days. This is in contrast to a standard “snail mail” hard-copy physical letter, where law enforcement has to get a warrant to read the letter no matter how old it is or where it happens to be in the process of transmission. Many people do not realize this about email—they assume it should have roughly the same protections as a regular letter, since the two are functionally equivalent. See ECPA Reform: Why Now?.

16 EPIC. “State Privacy Laws”.

17 The term “data protection” is commonly used in Europe to describe policies and procedures that enable what we are referring to as “informational privacy” in this book. This is not meant to suggest a one-to-one correlation, as there are differences, and a deeper exploration of those nuances is beyond the scope of this book.

18 Kerr, Orin. “Can the Police Now Use Thermal Imaging Devices Without a Warrant? A Reexamination of Kyllo in Light of the Widespread Use of Infrared Temperature Sensors”. The Volokh Conspiracy. January 4, 2010.

19 Solove, Daniel J., and Woodrow Hartzog. “The FTC and the New Common Law of Privacy”. Columbia Law Review 114, no. 3 (2014).

20 Mayer-Schönberger, Viktor, and Kenneth Cukier. Big Data: A Revolution That Will Transform How We Live, Work, and Think. London: John Murray, 2013. 9.

21 Greenwald, Glenn. “NSA Collecting Phone Records of Millions of Verizon Customers Daily”. The Guardian. June 6, 2013.

22 Greenwald, Glenn, Ewen MacAskill, and Laura Poitras. “Edward Snowden: The Whistleblower behind the NSA Surveillance Revelations”. The Guardian. June 11, 2013.

23 Although Lessig sets his metaphor in terms of U.S. geography, his underlying point about the interaction between those who make policy and those who write code is universal.

24 Preventing re-identification can be quite challenging, with some analysts and scholars suggesting re-identification will be more likely and normal than our current intuitions suggest. See, for example, Ohm, Paul. “Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization”. UCLA Law Review, 57 (2010): 1701.

25 Obviously the transaction is not always so simple. On the one hand, law enforcement and intelligence systems, for example, do not always involve a known sharing of data about oneself, but—at least in a democratic society—popular control of the government means that at a societal level a decision was made to give these organizations the power to collect and utilize this information for the larger benefit of safety and order. On the other hand, users’ decisions to share data are not always so deliberate, as demonstrated by the large numbers of users who express shock upon learning that their online activities were even recorded, let alone stored, processed, and analyzed. Moreover, claims to consent and oversight can seem dubious when considering databases and data-collection activities whose very existence is secret. This can also occur in the private sector, as with Flash cookies, user-agent profiling, recording WiFi probe packets, and so on.
