Chapter 4. The Players

Wherever you go, whatever you do, anywhere in this world, some “thing” is tracking you. Your laptop and other personal devices, like an iPad, smartphone, or BlackBerry, all play a role, contributing to a very detailed dossier of your likes, concerns, preferred airlines, favorite vacation spots, how much money you spend, political affiliations, who your friends are, the magazines you subscribe to, the make and model of the car you drive, the kinds of foods you buy; the list goes on. There are now RFID chips in hotel towels and bathrobes to dissuade you from taking them with you, while your in-room minibar collects information about every item you’ve consumed (to ensure that it’s properly stocked for your next visit). That convenient E-ZPass not only makes your commute easier, it also provides an accurate picture of your whereabouts on any given day, at any given time, as do all the video cameras installed at ATMs, in stores, banks, and gas stations, on highways, and at traffic intersections. Your car collects information about you: your location, speed, steering, brake use, and driving patterns. Although your home may be your castle, it is not, in the world we now live in, impenetrable. Google Maps provides a very accurate and detailed picture of it, and if you happened to have an unencrypted Wi-Fi network when Google drove by to capture that picture, it scooped up some of your personal data as well. You may be aware of all the digital tracking done by the Internet giants (Google, Facebook, and the rest), but with almost 40 percent of PCs worldwide infected with some form of malware that can gather information and send it back to its authors, that may be the least of your worries.

Some would say that we live in a dataveillance society, where our actions and communications are systematically monitored. Remarkably, that term was coined in 1987 by Roger Clarke, long before personal devices and the Internet of Things came into being and more than two decades before the Wall Street Journal’s landmark “What They Know” series, which took an in-depth look at the business of Internet spying. Surveillance, spying, eavesdropping, and tracking are the words journalists, pundits, and authors (such as ourselves) use to describe the constant monitoring of our lives.

But does this truly characterize the world in which we live? Digitization, by its very nature, makes surveillance a permanent part of our lives. It is now very easy to passively (and automatically) collect data that documents the minutiae of our lives and, with or without our cooperation, store it. Our smartphones and other devices are more effective collectors of individual information than the KGB or the Stasi, the most feared security agencies of the Cold War era. And the data we create is often copied and sold to third parties for purposes that are, at least from our perspective, unintended.

There are many players, including governments, involved in the ongoing privacy debate, which at its core asks this question: is your personal information property or a basic human right? If it is property, then each one of us can trade it for all kinds of things, and once we do, we cannot expect that the information remains private. If it is a basic human right, then we cannot negotiate it away. While the jury is still out, we believe that the answer will lie somewhere in between, informed by the politics of where you live (how’s that for clarity?).

Certainly, in the U.S., personal information is treated like property, while other regions and countries regard it more as a constitutional right (see Chapter 2 and Chapter 3). To join Facebook, we provide personal information about ourselves, we invite family and friends to join, and in turn they provide their personal information. Then we share things we like, things we hate, things that make us sad or glad, all documented by our favorite photos. All of this stuff is digitally saved and associated specifically with us, representing a gold mine to the private and public sectors; there is money to be made, behavior to be tracked, intellectual property to be protected, health risks to be monitored, political affiliations to be leveraged, and terrorists and criminals to be watched. The collection and use of our personal data was never just about advertising; it’s about everything.

The way in which we live has changed forever. Devices have made our lives easier, and we now live online for work and for play. Traditional industries, like print media, and businesses, like Borders, have been replaced by electronic publishing, like ebooks, and commerce sites, like Amazon. The Internet is here to stay, and technologically and socially speaking, it is a singular disruptive force, one that individuals, companies, and organizations of all kinds must reckon with.

As with all controversial and contentious issues, the privacy debate is chock-full of competing agendas. The players, individually and in groups, all have stakes in privacy-related regulatory actions and in how those actions are monitored and enforced. There are privacy watchdog groups seeking more comprehensive and restrictive privacy legislation; there are large collectors and holders of personal data who argue for self-regulation; and there are users of data who analyze it to discover all sorts of things (for the greater good, to make money, for national or commercial espionage, or simply to commit crimes). Countries and regions, and their respective governments, have their own agendas—some maintaining that privacy definitions should be more restrictive, giving individuals much more control over their information, while others seek to strip privacy from their citizens, constantly watching and monitoring to restrict speech, prevent uprisings, or identify potential terrorists or criminals. Even within a single government, it is not unusual to see competing visions of privacy from different branches or agencies.

We, or our personal information (however you might characterize it), are at the center of a privacy battle, caught in a tug-of-war between these various groups and players. After all, it is our information they are all fighting about. How we align ourselves with the various privacy movements depends heavily upon our worldview. But make no mistake, all the players have huge stakes in ensuring that our expectation of privacy aligns with theirs. To understand the underlying issues, one must first understand the motivations driving the players.

Meet the Players

While there are all kinds of privacy players weighing in on whether one should be fearful or sanguine about the state of privacy today, they can be categorized into four distinct groups:

  • Data Collectors. Whenever you use a personal digital device, such as a PC or cell phone (whether at work or at play), you generate data. You also generate data by interacting with technology that you don’t own but that still collects data about you, like RFID tags, loyalty and credit card readers, and CCTV cameras (located in public and private spaces). All of this data is collected by someone (or multiple someones) for some purpose, with or without your consent. Most often, that information is then sold or rented to third parties, frequently as data sets that can be combined (aggregated) with other data sets.

  • Data Markets (the Aggregators). Data markets are platforms where users (individuals, marketing organizations, companies, and government agencies) can search for specific data sets that meet their needs and then download them (for free or for a fee, depending on the data set).

  • Data Users. These are the people or organizations that buy or get free access to data, usually through applications (social media monitoring, retail analytics, customer relationship management, demand monitoring, inventory management, etc.). For example, if you’ve ever looked someone (or yourself) up on Spokeo, you are working with a number of data sets that have been combined by Spokeo to provide that profile.

  • Data Monitors/Protectors. There are a host of agencies and organizations that monitor privacy issues from various points of view and others that are involved in self-regulatory policies and enforcement for various industries or functions.

For each of these groups, the intrinsic value that our personal information represents may be quite different, but in all cases it is significant. Certainly, any company or person engaged in advertising would attest to that. And while advertising may be at the center of the privacy debate, there are all kinds of players who derive considerable value from our personal information. Still, it is online advertising that has pushed the technology envelope for creating and maintaining detailed digital profiles (behavioral and otherwise) of each and every one of us.

A (Very) Brief History of Online Advertising

Online advertising, although very different in scope and type, follows the same business model as offline advertising. It is quite simple: “... consumers are paid with content and services to receive advertising messages and advertisers pay to send these messages.”[29] Theoretically, any website delivers content and services in some form or another, so any website (and the company or individual behind it) could include paid advertising on its pages and profit from it. For advertisers, this represents a new frontier to explore, with costs far lower than those of traditional venues, along with the intriguing possibility of gathering much more information about their targeted audiences, which translates into more effective advertising and, ultimately, more sales.

It all began in 1994, when HotWired, the first commercial web magazine, displayed an AT&T banner ad, the first of its kind. Until then, advertising was limited to offline channels: publications such as magazines and newspapers, store displays, product packaging, television, radio, telephone, and of course the universally despised direct mail pieces that filled up all of our mailboxes. At that point, most of the web consisted of static content and was viewed as just another (new) advertising channel.

AT&T bought the ad based on the number of impressions (that is, the number of individuals who viewed the ad). This quickly evolved into the cost per 1,000 views (CPM) model, and then, in 1996, Procter & Gamble negotiated a deal with Yahoo! in which ads would be paid for on a cost-per-click (CPC) basis. This is very similar to the payment method employed by direct marketing houses and telemarketing organizations. Until 2008, these two models were the standard ways in which online advertising fees were set.[30]
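To make the difference between the two pricing models concrete, here is a minimal sketch in Python; the rates and volumes are ours, chosen purely for illustration, and are not drawn from any actual campaign.

    # Hypothetical figures, for illustration only -- not actual industry rates.
    impressions = 2_000_000      # number of times the banner was displayed
    clicks = 4_000               # number of times a viewer actually clicked it

    cpm_rate = 5.00              # advertiser pays $5.00 per 1,000 impressions
    cpc_rate = 0.75              # advertiser pays $0.75 per click

    cpm_cost = (impressions / 1_000) * cpm_rate   # billed on views
    cpc_cost = clicks * cpc_rate                  # billed on clicks

    print(f"CPM billing: ${cpm_cost:,.2f}")                   # $10,000.00
    print(f"CPC billing: ${cpc_cost:,.2f}")                   # $3,000.00
    print(f"Click-through rate: {clicks / impressions:.2%}")  # 0.20%

Under CPM, the advertiser pays for exposure whether or not anyone responds; under CPC, it pays only for responses, which is exactly why measuring (and improving) that click-through rate became so valuable.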

Thus a very large and profitable cottage industry was born: companies intent on creating and using technology to track (and record) how individuals navigated the web, what they looked at, and what they bought, along with a plethora of companies operating on the publishing or advertising side to facilitate the creation, placement, and tracking (including ROI) of ads based on targeted profiles. At the forefront of all of this was tracking technology (simple by today’s standards) designed to follow users around and, essentially, document their every digital move. Today, ETags, cookies, Flash cookies, beacons, supercookies, and history stealing are all employed to track and collect what you do online from any device, both by the sites you visit and by unknown third parties (one step removed, such as an ad on a page that you visit). Location tracking, where geo-location data generated by cell phones can be used to triangulate your location at any given time, is also on the rise, as is its use in targeted mobile advertising. All of this information can be tied directly back to you; the days of assuming anonymity on the web are over.
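For readers who want to see just how simple the basic mechanism is, here is a minimal sketch of a third-party “beacon”: a tiny endpoint that any page can embed, letting the tracker assign a persistent ID and log every visit. The Flask framework, the cookie name, and the log format are our own illustrative choices, not any particular ad network’s implementation.

    # A toy third-party "beacon" endpoint. Any site that embeds it lets the
    # tracker set a persistent ID cookie and log the visit. Flask, the cookie
    # name, and the log format are illustrative assumptions only.
    import uuid
    from flask import Flask, request, make_response

    app = Flask(__name__)

    @app.route("/beacon")
    def beacon():
        # Reuse the visitor's ID if we have seen this browser before;
        # otherwise mint a new one.
        visitor_id = request.cookies.get("tracker_id") or uuid.uuid4().hex

        # The Referer header tells the tracker which page embedded the beacon,
        # so one tracker on many sites can stitch together a browsing history.
        print(visitor_id, request.headers.get("Referer"), request.remote_addr)

        resp = make_response("", 204)  # invisible response; no content needed
        resp.set_cookie("tracker_id", visitor_id, max_age=60 * 60 * 24 * 365)
        return resp

    if __name__ == "__main__":
        app.run(port=8080)

A single line of HTML on each participating page is enough to call this endpoint, which is why one tracker can quietly appear on thousands of otherwise unrelated sites.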

From an advertising perspective, the Internet is unique in the amount of information it can generate and collect about individuals and groups of individuals, which leads to higher-quality and more targeted segmentation (creating a target audience based on a selected set of variables). The key here is that the more accurately you are able to target a prospective buyer based on his or her behavior (i.e., behavioral advertising), the better your return (click-through and conversion rates go up). Added to that, technology has advanced to such an extent that ads can be placed in near real time. For example, if you are on an orthopedic site reading about a specific knee brace and then go to another website, a banner ad for knee braces pops up just seconds later. Or you receive a 50 percent off coupon on your mobile device for the restaurant you are walking by. Finally, traditional offline advertising media are either dying (print magazines and newspapers) or waning (network television and radio) as consumers increasingly favor digital and streaming media and hang out on social media sites, all premier publishing venues for advertisers.
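The knee-brace scenario boils down to a simple matching problem. Here is a toy sketch of how an ad network might pick a campaign for a tracked visitor; the categories, bids, and scoring rule are all invented for illustration and do not reflect any real network’s auction logic.

    # Toy behavioral targeting: serve the campaign whose category best matches
    # the visitor's tracked browsing profile. All data here is made up.
    from collections import Counter

    # Page categories recently recorded by trackers for one pseudonymous visitor.
    browsing_history = ["orthopedics", "orthopedics", "running", "travel"]
    profile = Counter(browsing_history)

    # Available campaigns: the category each targets and its bid per impression.
    campaigns = [
        {"ad": "KneeFlex brace, 20% off", "category": "orthopedics", "bid": 2.50},
        {"ad": "Trail running shoes",     "category": "running",     "bid": 1.80},
        {"ad": "Weekend hotel deals",     "category": "travel",      "bid": 1.20},
    ]

    # Score = apparent interest in the category times what the advertiser pays.
    winner = max(campaigns, key=lambda c: profile[c["category"]] * c["bid"])
    print("Serve:", winner["ad"])   # the knee-brace ad wins for this visitor

The richer the profile, the more confidently the network can score each campaign, which is the whole economic incentive behind the tracking described above.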

From a publishing perspective, pretty much any site can incorporate advertising into its business model as a revenue channel. The rise of advertising intermediaries (companies that broker ad buys or placements for a fee, on behalf of either advertisers or publishers) and of networks that do the same at an aggregated level has made it possible for even small businesses, organizations, and blogs to create a substantial advertising revenue stream. Of course, this entire ecosystem is supported by the data suppliers and markets that deliver detailed information about each one of us. It goes without saying (but we will) that their most important asset is everything they individually and collectively know about us.

However you cut it, online advertising (made up of search, banner and video ads, classifieds, rich media, lead generation, sponsorships, and email) is big business. A recent forecast, courtesy of eMarketer, estimates that online ad spending will reach $31.3 billion in 2011 (with Google taking the lion’s share), up 20 percent from last year, and $50 billion by 2015. It accounts for nearly 20 percent of the major media dollar spend in the U.S. this year and is expected to make up almost 28 percent of the total spend by 2015. So while Google, Twitter, Facebook, LinkedIn, Yahoo, and Foursquare may offer diverse services to their users, their revenue and valuations are driven by the data they collect and the multi-billion dollar value it represents to advertisers.

When it comes to advertising and privacy, it could be argued that advertisers don’t care about what we do or where we go; they do not act as moral arbiters of our lives. They are simply interested in one thing: getting us to buy what they’re selling. If the privacy debate is framed with this in mind, it is easy to say that the impact is rather benign: what are a few more targeted ads delivered to your mobile when you’re near a particular store or served up to you when you’re surfing for a specific item? What’s the harm in trading your information for a specific service? Here’s the rub: the information collected about us is not just used by advertisers to sell stuff to us. It’s used for myriad purposes, none of which we have control over.

Intellectual Property Rights, Trusted Computing, and Privacy

While the advertising industry is interested in collecting data to get us to buy something, certain sectors heavily dependent on the protection of intellectual property (IP), like the recording, publishing, television, movie, and software industries, are far more interested in what we do with the items after we buy them. As we’ve said before, the digital age has been a disruptive force, and while the advertising industry has leveraged it to vigorously pursue its business model, other industries have watched theirs crumble.

The free and easy digitization of all kinds of “property,” whether music, movies, books, or video, has caused some powerful groups, like the Recording Industry Association of America (RIAA), the Motion Picture Association of America (MPAA), and their counterparts in the EU, to advocate for technology that protects their merchandise, and that advocacy strikes at the heart of the privacy debate. Essentially, they want to monitor, control, and in some cases remove or delete their products on any device we own to protect their intellectual property rights. To do this, they are enlisting the support of other parties, including hardware manufacturers, Internet Service Providers, legislatures, and law enforcement agencies. Most disturbing for privacy advocates, almost all of their approaches require the ability to uniquely identify digital devices and associate those devices and their uses with their owners. We don’t know about you, but we find this troubling at best and far more intrusive than anything the advertising industry has come up with. What exactly does purchasing and downloading, say, an online book really mean if someone can take it away from us without our permission?

Intellectual property law has been around, in some form or another, since the 1500s and, unlike privacy, is explicitly provided for in the U.S. Constitution (Article 1, Section 8, Clause 8). It is designed to grant owners of intangible assets, like musical, literary, and artistic works, or discoveries and inventions, certain exclusive rights to those assets. In the U.S., the laws and protections given to IP can be broadly categorized as follows:

  • Trade secrets are information that companies keep secret to gain a competitive advantage over others.

  • Copyrights are sets of rights granted to the creator of an original work for a limited (although historically, constantly expanding) period of time in exchange for public disclosure of the work. The “work” could be a book, play, sheet music, painting, photograph, sound recording, movie, or a computer program. The “rights” include the right to copy, distribute, and adapt the work. There is also the fair use doctrine, which allows for the reproduction of a work if it is used in commentary, critical reviews, research, parody, news reporting, and so on.

  • Patents are similar to copyrights in that they are a set of rights granted to an inventor for a period of time in exchange for public disclosure of the invention.

Essentially, intellectual property laws protect intangible assets from theft or piracy, and the digital age has certainly brought into question just what those terms mean. The advent of file sharing, where music, video, movies, and documents are easily shared between devices, propelled by the rise (and then fall) of first-generation peer-to-peer sharing networks, like Napster, Grokster, and Madster, as well as the continued success of BitTorrent-based sites like The Pirate Bay, has brought two ideologies into direct conflict: those who want to protect IP at all costs versus those who argue that the Internet, by definition, is designed to easily facilitate the sharing (as well as copying) of information between users.

To protect their IP, the recording industry went after Napster and Grokster as well as all commercial peer-to-peer networks. In 2005, it won a significant round as the Supreme Court found that “...file-sharing networks that intentionally profited by illegal distribution of music could be held liable for their actions.”[31] This ruling caused most peer-to-peer networks to shut down or work out some sort of legal distribution agreement (for example, Apple’s iTunes business model) with the record companies.

Of course, as in most things, money, specifically the perceived economic loss incurred by the sharing and copying of “pirated” files, formed the heart of this conflict. One of the more famous figures bandied about is “that 750,000 jobs and up to $250 billion a year could be lost in the U.S. economy thanks to IP infringement.”[32] These statistics and others, like the Business Software Alliance’s estimate that the U.S. piracy rate for business software was 20 percent, or $9 billion, in 2008 and the Motion Picture Association’s estimate that its studios lost $6.1 billion to piracy in 2005, have been debunked by none other than the Government Accountability Office, which came to this conclusion:

“While experts and literature we reviewed provided different examples of effects on the U.S. economy, most observed that despite significant efforts, it is difficult, if not impossible, to quantify the net effect of counterfeiting and piracy on the economy as a whole... To determine the net effect, any positive effects of counterfeiting and piracy on the economy should be considered, as well as the negative effects.”[33]

While estimates of the economic damage caused by piracy and counterfeiting were, and are, questionable, that did not stop these industries from lobbying for, and receiving, legal digital copyright protection. The Digital Millennium Copyright Act (DMCA, 1998) and the European Union Copyright Directive (EUCD, 2001) implemented the World Intellectual Property Organization’s (WIPO) Copyright Treaty (1996), which specified the following:

  • It is illegal to circumvent copyright protection technology (otherwise known as digital rights management) designed to protect materials. In other words, you (as in each and every one of us) cannot use available tools to “break” copyright and share and copy files.

  • Internet Service Providers (ISPs) are required to take down the hosted sites of copyright infringers once they are made aware of the problem. In other words, ISPs are the copyright police, and the top ones are already on board, including Comcast, Cablevision, Verizon, and Time Warner Cable.

The concept of digital rights management (DRM) is quite simple: it is technology used by IP holders (publishers, hardware and software manufacturers, etc.) to control access to their copyrighted materials on digital devices. Out of the desire for an advanced DRM system, Microsoft, Intel, IBM, HP, and AMD got together and formed the Trusted Computing Group (TCG). The goal of this group was to develop a standard for a more secure PC, or, as Ross Anderson puts it:

“[It] provides a computing platform on which you can’t tamper with the application software, and where these applications can communicate securely with their authors and with each other. The original motivation was digital rights management (DRM): Disney will be able to sell you DVDs that will decrypt and run on a TC platform, but which you won’t be able to copy. The music industry will be able to sell you music downloads that you won’t be able to swap. They will be able to sell you CDs that you’ll only be able to play three times, or only on your birthday. All sorts of new marketing possibilities will open up.”[34]
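To make Anderson’s “play it three times” scenario concrete, here is a toy sketch of the kind of license check DRM makes possible, in which content works only as long as the rights holder says so and can be revoked after the sale. It is purely illustrative and does not reflect how any actual TPM, Kindle, or commercial DRM product is implemented.

    # Toy DRM-style license check; nothing here reflects a real implementation.
    licenses = {
        # content_id: plays remaining (None or 0 means revoked or never licensed)
        "album-123": 3,      # "a CD you can only play three times"
        "ebook-1984": None,  # remotely revoked by the rights holder
    }

    def play(content_id):
        remaining = licenses.get(content_id)
        if not remaining:                      # revoked, exhausted, or unknown
            return f"{content_id}: access denied by rights holder"
        licenses[content_id] = remaining - 1   # decrement the play counter
        return f"{content_id}: playing ({remaining - 1} plays left)"

    for _ in range(4):
        print(play("album-123"))   # works three times, then is refused
    print(play("ebook-1984"))      # denied outright

The point of trusted computing is to move a check like this out of software you control and into hardware you cannot tamper with, so the decision is never really yours.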

The TCG introduced the Trusted Platform Module (TPM; yes, we love acronyms as much as you do!), which is both a specification and the name for hardware that implements it, like the TPM chip. You may also know the chip by another name, the Fritz chip. It was named in honor of Senator Fritz Hollings of South Carolina, who wanted the chip in pretty much every device we own:

“Hollings’ bill... would require any device that can ‘retrieve or access copyrighted works in digital form’ to include a federally mandated copy protection system... That covers not just your next iPod or Windows Media Player, but just about every digital device with a screen, a printer, an audio jack, a disk drive, a memory stick, or several input/output devices yet to be invented. Your computer, your camera, your car stereo.”[35]

The TCG, TPM, and Fritz chip have all been met with a great deal of controversy. Besides the concern that TPM and the chip (which is now used by almost all the notable PC and notebook manufacturers) would cause consumers to lose all anonymity in online transactions, some have argued that the TCG members have, by virtue of their alliance, made themselves far more powerful, to the point of monopolistic and unfair business practices. Others have pointed out that whoever controls the infrastructure becomes a single, powerful point of control. Anderson equates this to making everyone use the same bank, lawyer, or accountant.[36] There is also the issue of remote censorship, as digital objects can be easily removed from any device without users’ permission. Amazon certainly illustrated this power when it removed two of George Orwell’s books, Animal Farm and 1984, from customers’ Kindles:

“This is precisely the functional equivalent of Barnes & Noble -- or Amazon itself for that matter -- using a crowbar or lock pick to break into your home or business, then stealing back a previous physical book purchase, replacing it with the equivalent value in cash,” said privacy advocate Lauren Weinstein.[37]

Outside of the monumental potential for abuse by large corporations and governments (and all of their agencies), when fully implemented, TPM essentially strips us of any expectation of privacy or anonymity. It uniquely identifies every device, ties that device to its owner(s), and then monitors and reports back on what is read, written, watched, or listened to. Do we really want any government or corporation to be able to easily monitor and decide what we should or should not be reading or what software we are allowed to use on our devices? Now, although the Fritz chip can be found in most PCs, it has been banned in China. For some odd reason, the Chinese have no interest in giving U.S. corporations (or their government) the ability to turn off the operating system for every computer connected to the Internet.

While the EU is pushing for a right to be forgotten, it appears that powerful IP players are pushing for the right to know everything about what we do with their IP. Historically, this has never been the intent of copyright law, as Marc Rotenberg noted in a Senate Subcommittee hearing way back in 1998:

“Traditionally, copyright law has not posed a particular problem for privacy protection. Readers, listeners and viewers have always enjoyed very high levels of privacy, by practice if not by law, without any threat to the interests of copyright holders. Copyright grants certain rights to copyright holders, but these rights do not include the right to know the identity of the copyright user.”[38]

Worse yet, it appears that we no longer own our digital assets, since copyright holders, or those acting in their interests, can remove them. Instead, we are merely renters, but with far fewer rights than renters of physical property.

It is interesting to note that while individuals and their use of IP assets have been under tremendous scrutiny, a far more insidious threat has, until recently, been flying under the radar: the theft of patents, trade secrets, and copyrighted assets through cyber attacks whose scope, sophistication, and targets suggest the work of nation-states, not individual hackers. A recent study by McAfee uncovered that the networks of “72 organizations including the United Nations, governments and companies around the world had been infiltrated,”[39] which is considered to be “... the biggest transfer of wealth in terms of intellectual property in history.”[40] Our point is this: instead of focusing on individuals, perhaps the government, large IP holders, their assorted lobbyists, and industry groups should be turning their attention to this much larger threat to intellectual property.

Pushing the Privacy Envelope All the Way to the Bank

While the IP stakeholders have been busy redefining “privacy” for their own ends, Google, Yahoo, Facebook, and others have been equally busy making billions of dollars collecting our data and using it for targeted advertising. Of course, any company or organization that collects data can offer it for sale or for free. Certainly, federal and state agencies, in their move toward a more open and transparent government, have made many comprehensive data sets available for public use (ranging from census to weather to loan and property information). Sometimes this data is sold, as illustrated by the state of Florida, which made $63 million last year by selling DMV information (including name, date of birth, and type of vehicle driven) to companies like LexisNexis and Shadow Soft. Sometimes this data is made available due to state transparency laws, as illustrated by Florida’s public display of mug shots on government websites. This practice formed the basis of several businesses, the most lucrative being one that helps you remove your arrest information from public websites.

It is, however, the big companies and their ecosystems that are most responsible for pushing the privacy envelope, and most of them are involved, in one way or another, with social networking (Google, Yahoo, Facebook, Twitter, Groupon, LinkedIn, and Zynga, for example). These companies are in the primary business of data: collecting, sharing, and even selling users’ information. Unsurprisingly, most of them have also drawn regulatory scrutiny:

“... in the past year, Silicon Valley firms have seen a bevy of Web companies like them swept into investigations for consumer protection violations and fraud... Last week, Internet radio site Pandora revealed that it was called into a broad federal grand jury investigation into the alleged illegal sharing of user data by a number of companies that create apps for iPhone and Android devices. Days earlier, Google settled with the Federal Trade Commission on charges it exposed data through its Buzz social networking application without the permission of users. Last year, Twitter settled with the agency after an investigation found the site’s loose security allowed hackers to access user information.” [41]

These giants are the owners of a treasure trove of personal information. How did they collect it? Much of it was (and continues to be) given away by individuals in the course of interacting with each other on these sites. Some of it was obtained through digital tracking, known or not, as shown by the Wall Street Journal, which conducted an experiment to explore the practice. The Journal discovered that 50 of the most popular sites (representing 40 percent of all web pages viewed by Americans) placed a total of 3,180 tracking devices on the Journal’s test computer (in a simulation of a normal user surfing sites, buying goods, and interacting with others via social networks). Most of these devices were unknown to the user.[42] Additionally, the Journal discovered that Flash cookies “... can also be used by data collectors to re-install regular cookies that a user has deleted. This can circumvent a user’s attempt to avoid being tracked online.”[43] A Stanford study also found that half of the 64 online advertising companies it examined (including Google, Yahoo, AOL, and Microsoft) continued tracking even when Do Not Track options were activated. So even if we, as users, are proactive about privacy, it does not follow that the collectors will be prevented from gathering information about us.

Adherence to privacy policies and practices is also under fire. Facebook is well known for its questionable privacy policies, which are based on defaulting user sharing to a public option (although Google’s launch of Google+, which defaults user sharing to a private option, has caused Facebook to change some of its practices). Last year, it was revealed that third-party Facebook applications were able to collect personally identifiable information about Facebook users, specifically user ID numbers that could “...be used to look up the user’s real name and sometimes other information users have made public, and potentially tie it to their activity inside the apps.”[44] This year, Facebook renewed concerns about its privacy policies with the release of its Tag Suggestions feature, based on facial recognition technology, which enables its users to more easily identify and label friends and acquaintances who appear in their posted photos. The reason for concern is that this feature was offered on an opt-out basis, meaning that users would have to manually go into their settings and turn it off.

The issue of opting out versus opting in for any feature or service is central to the privacy debate. Privacy proponents argue that the use of opt out erodes users’ privacy without their knowledge and enables collectors to gather even more personal information, pushing the privacy envelope even further. Whatever level of privacy you had prior to the introduction of a new feature or service erodes, and the new level of sharing becomes a social norm because “everyone is doing it.” Most collectors caught using opt out argue that the loss of privacy was accidental: there was no intent to invade anyone’s privacy, merely to make it easier to adopt a great new feature. In our view, that is a dubious claim at best, given the clear economic benefits to the collector.

Should you think this type of behavior is limited to Facebook, think again. After going public, LinkedIn announced in its blog that it would allow advertisers to include us in ads if we recommended their product or service. Specifically, buried within each member’s account settings page, it described its new service this way:

“LinkedIn may sometimes pair an advertiser’s message with social content from LinkedIn’s network in order to make the ad more relevant. When LinkedIn members recommend people and services, follow companies, or take other actions, their name/photo may show up in related ads shown to you. Conversely, when you take these actions on LinkedIn, your name/photo may show up in related ads shown to LinkedIn members. By providing social context, we make it easy for our members to learn about products and services that the LinkedIn network is interacting with.”

This feature was already turned on for all LinkedIn members, who had to opt out to turn it off. More importantly, most members were unaware of the change until it was reported by the media. Groupon also recently announced a significant change in its privacy policy in an email to all its users. It will now be collecting more information about its users to share with partners and using geo-location information to market to them.

If your business model is predicated on the collection and ownership of personal information that results in a revenue stream worth billions of dollars, it stands to reason that you view this information as your property. As such, you can buy it, rent it, and sell it. It is also in your best interest to discover, via trackers, services, or new features, as much as you can about the people who use your site. After all, the more targeted the profile, the more valuable it becomes to the advertiser. For these companies, stringent privacy regulations would curb their ability to make money and, in their words, “deprive consumers from advertisers’ abilities to serve up more relevant ads.” This is certainly the case lobbyists are making for Google (which spent $5.2 million on lobbying in 2010), Yahoo ($2.2 million), Apple ($1.2 million), and Facebook ($350,000). While the first part of this argument makes economic sense, it is disingenuous to suggest that consumers, who have indicated through a number of surveys their growing disapproval (up to 86 percent) of tailored advertising, would feel deprived by less invasive advertising.

Unprecedented Access Further Erodes Privacy Expectations

From a privacy point of view, data markets and aggregators follow the collectors’ playbook. They also offer a treasure trove of personal information, but often on a much larger scale, since they function as marketplace platforms where anyone can search for and find data sets to fill almost any need, and where companies, organizations, and individuals can offer up their data sets for sale.

Data markets can be quite specific (focusing on a single market, for example) or quite broad (multiple subjects or audiences, sets of tracked behaviors or other variables, with historical performance and indicators, as examples). Thomson Reuters offers data about the risk of doing business with an individual or a company; InfoChimps offers a broad range of data sets; Gnip is focused on social media feeds; Microsoft Azure offers data sets and services oriented toward (and from) its customers and partners; and Nielsen offers data sets and services oriented toward the industries it covers (media and entertainment, consumer packaged goods and retail, and telecom). There are also data markets that specialize in advertisers’ interests, like Rapleaf, Acxiom, ChoicePoint (now part of Reed Elsevier), Quantcast, and BlueKai, which provide targeted user profiles (including email addresses, residential addresses, names, income, social networks, and much more) through the aggregation of many data sets and the use of tracking devices. These are just a few of the markets out there: there are hundreds of companies that operate as middlemen for all kinds of data categories, many offering analytics and other services to help transform the data into information that can be acted upon (whether the action is to place an ad, determine the whereabouts of a person, ask for a charitable donation, predict where a crime may occur, or identify protestors in a march).

Just like the collectors, these players have little interest in comprehensive privacy regulations or guidelines, as they do not serve their business models. For example, last year Rapleaf came under fire for linking user names and email addresses to specific social networking profiles and then selling that information to third parties. And, like Facebook, this was not the first time that Rapleaf was accused of privacy violations:

“In 2007, CNET reported that the company operated two other subsidiaries that secretly shared information with one another to create extremely detailed profiles about users -- including their social network affiliations. Rapleaf quickly responded by merging all of its businesses under one brand.”[45]

But with data markets, privacy violations are not always willful or known. The number of data sets available for purchase or for free is growing as fast as the underlying data that drives them. The markets make it easy to find and buy any number of data sets, which sets in motion the leakage of private information, as evidenced by a recent AT&T Research study of 120 of the most popular Internet sites, which found that:

“... fully 56 percent of the sites directly leak pieces of private information with this result growing to 75 percent if we also include leakage of a site userid. Sensitive search strings sent to healthcare Web sites and travel itineraries on flight reservation sites are leaked in 9 of the top 10 sites studied for each category.”[46]

In a previous study, AT&T Research looked at how third parties can link personally identifiable information (PII) that is leaked by social networks with other user actions on that site and on other sites.[47] In other words, it is fairly easy to collect private information and link it to a specific individual, as the study points out:

“A well-known result in linking pieces of PII is that most Americans (87 percent) can be uniquely identified from a birth date, five-digit zip code, and gender.”[48]

The emergence of new data analysis systems known collectively as “big data” has dramatically lowered the cost of merging and analyzing large data sets. These big data systems, including Hadoop, S4, Cloudera, StreamInsight, BackType (recently purchased by Twitter), and our own PatternBuilders Analytics Framework, make it relatively easy for companies and individuals to find, buy, and aggregate any number of data sets from any number of data markets, which, in turn, makes it that much easier to derive private information.
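A toy sketch shows why the 87 percent result quoted above matters once data sets can be cheaply merged: an “anonymized” file that keeps birth date, ZIP code, and gender can often be joined straight back to an identified one. The file contents, column names, and pandas-based join below are our own illustrative assumptions, not any real data source.

    # Toy re-identification by linkage. All records are invented.
    import pandas as pd

    # "Anonymized" purchase records: no names, but quasi-identifiers remain.
    anonymous = pd.DataFrame({
        "birth_date": ["1975-03-14", "1982-11-02", "1990-07-30"],
        "zip_code":   ["94107", "10001", "60614"],
        "gender":     ["F", "M", "F"],
        "purchase":   ["knee brace", "e-reader", "flight to Austin"],
    })

    # A publicly available, identified data set (voter-roll or marketing-list style).
    identified = pd.DataFrame({
        "name":       ["Alice Smith", "Bob Jones", "Carol Lee"],
        "birth_date": ["1975-03-14", "1982-11-02", "1990-07-30"],
        "zip_code":   ["94107", "10001", "60614"],
        "gender":     ["F", "M", "F"],
    })

    # Join on the three quasi-identifiers; for roughly 87 percent of Americans
    # this combination is unique, so the "anonymous" purchase gains a name.
    linked = anonymous.merge(identified, on=["birth_date", "zip_code", "gender"])
    print(linked[["name", "purchase"]])

Nothing here requires a supercomputer; a few lines and two purchased data sets are enough, which is exactly the worry.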

The more data you are able to collect and connect to other data sets, the easier it is to obtain what was thought to be private information and then tie that information to a specific individual. Once that link is made, you are able to build a detailed profile, with multiple data sets providing you with more and more information. Certainly, a recent announcement by advertising giant WPP PLC about the launch of a new company, Xaxis, may give many privacy advocates pause (it certainly gave us pause), as Xaxis will manage:

“... the world’s largest database of profiles of individuals that includes demographic, financial, purchase, geographic and other information collected from their Web activities and brick-and-mortar transactions... WPP executives say Xaxis will have more than 500 million unique profiles, reaching virtually 100 percent of the population in markets where it operates.”[49]

While WPP points out that all this information is anonymous and that it will self-regulate, this much information about any one profile would make it fairly easy to attach a name, email address, and other personal identifiers.

In essence, data markets act as third-party brokers of data sets. What happens after those sets are purchased and then used for any number of business or research purposes is unknown. While the markets may be subject to the collectors’ privacy policies and have their own as well, it is not clear how data usage can be policed and enforced once the data changes hands. Anonymity may be promised, but as Netflix’s research contest famously demonstrated, big data often makes it easy to break.

Letting the Genie Out of the Bottle

Much of the privacy debate focuses on the data and its collectors and markets. We are bombarded with information (as we have just shown) on how easy it is to track and collect data about us, but equally important is the fact that, outside of advertising, a great deal is unknown about how our data is, and could be, used. And while we understand what advertisers are doing with it, as Jeff Jonas said in a recent interview:

“The truth about data is once it’s out there, it’s hard to control.”[50]

And there have been plenty of examples of the many ways in which our data is used—with or without our knowledge:

  • The Creepy application, which takes location information from every tweet (for a Twitter user) or uploaded photo (for a Flickr user) and plots it on a map, revealing hot spots around users’ homes, workplaces, and other places they routinely visit.

  • U.S. law enforcement officials who use GPS technology to track criminal suspects and parolees without their knowledge and without meeting the standards of wiretap laws or other laws regulating electronic surveillance because they “do not record conversations.”[51]

  • The iPhone Tracker, developed by Pete Warden and Alasdair Allan to show how the iPhone’s unencrypted location history file (which holds more than a year of location data) can be used to provide a detailed picture of wherever you have been.

  • The network analysis software employed by the Richmond, Virginia, police department to analyze “... the social networks around suspects, such as dealings with employers, collection agencies, and the Department of Motor Vehicles. The goal... is to pull together a complete picture of suspects and their social circle.”[52]

  • The use of facial recognition technology to quickly find terrorists and criminals using digital surveillance photos.

  • The use of handheld facial recognition devices by more than a dozen law enforcement agencies where an officer can “snap a picture of a face from up to five feet away, or scan a person’s irises from up to six inches away, and do an immediate search to see if there is a match with a database of people with criminal records.”[53]

As the preceding examples show, various government agencies have taken a page or two out of advertising’s playbook, increasing their data collection efforts through collaboration with other agencies as well as third parties (commercial data collectors and markets), building large databases, and engaging in sophisticated data mining efforts. Some of these projects focus on operational efficiency (the Department of Veterans Affairs), others on fraud detection (Medicare and the IRS), criminal investigation (fusion centers), crime prevention, and counterterrorism.[54]

Certainly, various government agencies are intent on creating some very large databases:

  • The FBI’s Investigative Data Warehouse, which houses operational and intelligence information from more than 53 data sources and held, as of September 2008, nearly one billion documents.[55]

  • The National Security Agency’s secret collection of “... phone call records of tens of millions of Americans, using data provided by AT&T, Verizon, and BellSouth.”[56]

  • The Transportation Security Administration’s possible resurrection of an airline passenger profiling system (similar to the now defunct and controversial Computer Assisted Passenger Prescreening System) that would most likely rely on commercial data sources and PNRs (passenger name records),[57] which reveal a great deal of personal information about a passenger (flights, hotel and rental car reservations, meal preferences, emergency contacts, special room requests, notes about tastes and preferences; the list really does go on and on).

Much of this effort is focused, of course, on pulling together information from multiple data sources:

“The 2004 GAO report on government data mining found that more than one-fourth of all government data mining projects involved access to data from the private sector. The government has broad powers for doing so. It can access publicly available data on the same basis as any member of the public, it can contract for data, and it can exercise its unique power to issue subpoenas, search warrants, wiretap orders, National Security Letters, and FISA orders that require the product of personal data, usually in secret.”[58]

It’s clear that federal agencies are increasingly delving into “... the vast commercial market for consumer information, such as buying habits and financial records, ... tapping into data that would be difficult for the government to accumulate but that has become a booming business for private companies.”[59]

However you may choose to characterize the U.S. privacy regulatory landscape, some effort has been, and is being, made to regulate privacy as it applies to the commercial sector. But our various government agencies are not held to those same regulatory standards, as first shown by Miller v. United States:

“In Miller v. United States and subsequent cases, the Supreme Court created a broad gap in the privacy protection provided by the Fourth Amendment by finding that the government’s seizure of personal information from third parties is outside its scope. As a result, the government’s behavior need not be reasonable nor is any judicial authorization required when the government searches or seizes personal information held by third parties.”[60]

Privacy advocates or not, all of us should be troubled by the lack of privacy protections afforded us by government agencies. They are the beneficiaries of large troves of valuable personal information and yet are not subject to any regulations regarding its use. How do we ensure that the regulators are, in fact, subject to the same privacy policies and laws?

Once we give out personal information, it is out of our control; it can be collected, stored, sold, rented, used, and analyzed for any number of purposes. If we don’t know what will happen to our data as it passes through any number of hands, then how can we make any decision about what to give away and what to keep? Do we assume, as Danah Boyd recommended (see Chapter 2), that our online behavior is public by default and private by effort? Do we rely upon privacy watchdog groups, consumers, and the various regulatory agencies to monitor and identify the most egregious privacy violations to keep the collectors, markets, and application providers (and users) honest? Do we lobby our legislature for more comprehensive privacy policies that apply to all government agencies? Some argue that privacy, as we knew it, no longer exists. But throughout history (ours and the rest of the world’s), someone has always argued that very point. Perhaps we should say instead that privacy must be redefined within the framework of our digital world.

Those that Protect and Serve in the Name of Privacy

While U.S. companies and various government agencies (ours and others around the world) are some of the largest collectors of personal information in the world, there are no comprehensive U.S. privacy regulations. Most privacy actions are handled in state courts via tort law, where actual harm must be shown, unlike in the EU and other regions that have comprehensive privacy regulations and a very different view of privacy. As a result, much of U.S. privacy policing is accomplished through watchdog organizations, regulatory agencies (see Chapter 3), and industry self-regulatory bodies. And of course, blogs and the media can often be counted on to discover and report issues and violations.

There are a number of privacy organizations, like the Electronic Privacy Information Center (EPIC), the American Civil Liberties Union (ACLU), and the Electronic Frontier Foundation. These organizations cover privacy policy as it applies to children, smartphones, the government, social networks, and a host of other areas. They institute and/or cover privacy-related court cases, provide research and analysis, and give guidance on how to prevent privacy violations. For a comprehensive list of U.S. and international organizations and what they cover, see EPIC’s[61] online guide to privacy resources.

There are also alliances and organizations, largely promoted by the advertising industry and data collectors in a bid to forestall more privacy legislation, that provide self-regulatory guidelines and certify those companies that adhere to them. A coalition of trade groups recently announced the Digital Advertising Alliance, a program designed to self-regulate digital tracking practices and offer consumers who do not wish to be tracked an opt-out icon. Participating companies include major advertisers, like AT&T, Verizon, Dell, and Bank of America, and major ad networks, like AOL, Google, Microsoft, and Yahoo. However, according to comScore, 181 of the top 200 advertisers are not taking part in this program.

The Network Advertising Initiative is an association of advertising networks, data exchanges, and marketing analytics services providers. It educates consumers on how they can protect themselves online, provides information on what is being tracked, and offers consumers a way to opt out of participating members’ behavioral advertising programs. There is also a Self-Regulatory Program for Online Behavioral Advertising, launched by some of the largest media and marketing associations (representing more than 5,000 companies that advertise on the web). It features an Advertising Option Icon that alerts consumers when a site is engaged in behavioral advertising and offers an easy opt out.

As with any other part of the privacy landscape, the number of organizations that monitor various aspects of privacy as it applies to industries, topics, or issues is vast. The cynics among us (and we are part of this group) might ask this question: if much of privacy enforcement happens after a violation, how many violations go unreported? In other words, how do we protect ourselves if we don’t know what we’re protecting ourselves from? Whenever there is a vacuum, something rushes to fill it, as evidenced by the number of companies offering consumers privacy solutions and providing businesses and organizations with ways to certify that they meet privacy guidelines in the U.S. and abroad.

The Rising Privacy Economy

While the erosion of privacy may be big business, there are all kinds of companies rising to a different challenge: preserving, or often redefining, consumer privacy to better fit the digital framework in which we live. There are companies that help businesses and organizations certify that they meet specific privacy standards. There are companies and a host of tools that help consumers block tracking and monitoring. There are companies that help consumers control their personal data in a number of new and interesting ways, and there are movements intent on recasting privacy in the digital age.

Over the years, studies have shown that consumers are becoming more concerned about Internet privacy but, as a recent Harris Interactive Poll highlights, they are now also assuming responsibility for it: 92 percent indicating that they have some responsibility for protecting their data, a majority expecting organizations to assume responsibility, and 42 percent indicating that they trust themselves most to protect their privacy.[62] They are also willing to pay for it. Recent research from Carnegie Mellon University indicates that privacy may very well be a key competitive advantage for companies in the digital age:

“Our results offer new insight into consumers’ valuations of personal data and provide evidence that privacy information affects online shopping decision making. We found that participants provided with salient privacy information took that information into consideration making purchases from websites offering medium or high levels of privacy. Our results indicate that, contrary to the common view that consumers are unlikely to pay for privacy, consumers may be willing to pay a premium for privacy. Our results also indicate that business may use technological means to showcase their privacy-friendly privacy policies and thereby gain a competitive advantage.”[63]

The success of online privacy solution providers such as TRUSTe, which has certified over 4,000 web properties, as well as the advertising industry’s focus on offering consumers tracking opt-outs and more transparency about what is being tracked, certainly attests to an increasing focus on privacy from both a business and a consumer perspective. Companies like Google (via its Me on the Web and Ad Preferences tools) and Rapleaf (via its See Your Info page) are also providing information on what is being tracked as well as offering consumers the ability to edit information or opt out.

For savvy users, there are a host of tools available that block cookies, ensure email and file privacy, enable anonymous surfing, and so on. For most, however, the sheer number of tools that must be employed to cover the various aspects of privacy is daunting. But, just as antivirus suites did for security, it is likely that privacy products will emerge that combine these tools (or something like them) into one easy solution.

There are also new privacy business models that allow consumers to find out what information is available about them, repair false information, and even determine what information they wish to share with advertisers. MyPrivacy removes personal information from websites. MyReputation monitors your online presence and customizes it to present you or your business in the best possible light. SafetyWeb protects your child’s online reputation and privacy. Singly allows people to aggregate and own their personal data through digital lockers.

Privacy by Design (PbD), a framework for building privacy into products or services developed by Ann Cavoukian, Ontario’s Information and Privacy Commissioner, is taking center stage as it forms the basis of the FTC’s proposed framework for businesses and policymakers. There is also a Silicon Valley trade group, the Personal Data Ecosystem Consortium, which promotes the idea that individuals should control their own data through the use of personal data stores and services, as well as organizations, like the International Association of Privacy Professionals (IAPP), focused on developing and supporting privacy professionals throughout the world.

There are many pundits who argue that privacy is dead, but the desire for privacy certainly is not. While the emerging privacy ecosystem can help consumers regain some control over their personal information, they must do so within a digital framework. One way or another, a new era of privacy is upon us.

While the Players Are Playing, Consumer Privacy Continues to Erode

Our personal information is used to feed all kinds of business models. In fact, we are the fuel for what has become a multi-billion dollar economic engine. And once the data leaves collectors’ hands, we, the consumers, have absolutely no control over who uses it or for what purpose. Technology marches on—in the form of big data storage, access, and analytics, new and improved devices, embedded TPM chips, RFID tags on everything, the smart grid—and has made it possible to figure out who we are from three easily attained pieces of data: birth date, gender, and zip code. So where does this leave us?

So where does this leave us? Well, we’ve allowed our data to be collected in return for the services that we value and the devices that we use. Perhaps we should ask ourselves this question again: how much privacy are we willing to give up for our online services and devices? If your answer is all of it, continue doing what you’re doing. If you find yourself wondering whether this is too high a price to pay, we don’t have any easy answers for you.

There’s an old saying: you can’t unring the bell. The data that we all put out there, knowingly or not, is out there. You cannot take it back. It travels through many hands, and is traded and copied. Although the EU is proposing a new law, the right to be forgotten (under which websites may be compelled to delete all the data they hold about a specific user), here is the unvarnished truth: that same information is most likely housed in databases all over the world. You will never be able to erase it.

So what can you do? Well, you can be more conscious of what you do online and understand that pretty much everything out there is public information and is, for all practical purposes, immortal. You can cut down the number of sites you belong to, put up fewer photos and videos, and understand that when you search for something, that search is being tracked. You can use various tools to prevent tracking, and you can monitor your devices and protect them from malware. You can follow privacy-focused websites and engage with your legislators on what can be done in terms of privacy regulations and policies.

But if you, like us, live in the industrialized world and desire a convenient and full life, your online privacy future is bleak. You can’t unring this bell, but you can reduce your exposure, keeping in mind that (similar to Las Vegas): “What happens on the Internet stays on the Internet, forever.” Our advice: your best bet for a semblance of digital privacy is to control how much information you put out there, keep yourself informed about how the technologies you use affect your privacy, and vote with your dollars against companies that abuse your trust.

Bibliography

  1. Peephal.com, “Privacy Leakage on Popular Web Sites,” June 17, 2011

  2. Professor John Blackie, “The Doctrinal History of Privacy Protection in Unity and Complexity,” University of Strathclyde

  3. Gerard Alexander, “Illiberal Europe,” American Institute for Public Policy Research, 2006

  4. Jacob Mchangama, National Review Online, “Censorship as Tolerance,” July 19, 2010

  5. Benjamin D. Brunk, First Monday: Peer-Reviewed Journal on the Internet, “Understanding the Privacy Space,” Volume 7, Number 10, October 2002

  6. Joseph Turow, Lauren Feldman, & Kimberly Meltzer, Annenberg Public Policy Center, “Open to Exploitation: American Shoppers Online and Offline,” June 2005

  7. Janice Y. Tsai, Serge Egelman, Lorrie Cranor, Alessandro Acquisti, Information Systems Research, “The Effect of Online Privacy Information on Purchasing Behavior: An Experimental Study,” Vol. 22, No. 2, June 2011, pp. 254-268

  8. David S. Evans, University College London and University of Chicago, “The Online Advertising Industry: Economics, Evolution, and Privacy,” April 2009

  9. Kenneth C. Laudon & Jane Price Laudon, Management Information Systems: Managing the Digital Firm, “Chapter 4: Ethical and Social Issues in Information Systems,” April 2005

  10. Nate Anderson, arsTechnica, “U.S. Government finally admits most piracy estimates are bogus,” August 2010

  11. GAO Report to Congressional Committees, “Intellectual Property: Observations on Efforts to Quantify the Economic Effects of Counterfeit and Pirated Goods,” April 2010

  12. Ross Anderson, “Trusted Computing Frequently Asked Questions,” August 2003

  13. Paul Boutin, Salon, “U.S. prepares to invade your hard drive,” March 29, 2002

  14. Thomas Claburn, InformationWeek, “Amazon Says It Will Stop Deleting Kindle Books,” July 17, 2009

  15. Testimony and Statement for the Record of Marc Rotenberg, Director, Electronic Privacy Information Center Adjunct Professor, Georgetown University Law Center Senior Lecturer, Washington College of Law, on H.R. 2281. The WIPO Copyright Treaties Implementation Act and Privacy Issues, Before the Subcommittee on Telecommunications, Trade, and Consumer Protection, Committee on Commerce, U.S. House of Representatives, June 5, 1998

  16. Jim Finkle, msnbc.com, “Biggest-ever series of cyber attacks uncovered, UN hit,” August 3, 2011

  17. Cecilia Kang, The Washington Post, “Web firms face increased federal scrutiny over Internet privacy,” April 8, 2011

  18. Julia Angwin, Wall Street Journal, “The Web’s New Gold Mine: Your Secrets,” July 30, 2010

  19. Julia Angwin, Wall Street Journal, “Latest in Web Tracking: Stealthy Supercookies,” August 18, 2011

  20. Geoffrey A. Fowler, Wall Street Journal, “More Questions for Wall Street,” October 18, 2010

  21. David Goldman, CNNMoney, “Rapleaf is selling your identity,” October 21, 2010

  22. Balachander Krishnamurthy, Konstantin Naryshkin, Craig E. Wills, AT&T Labs and Worcester Polytechnic Institute, “Privacy leakage vs. Protection measures: the growing disconnect,” 2011

  23. Balachander Krishnamurthy, Craig E. Wills, ACM SIGCOMM Computer Communication Review, “On the Leakage of Personally Identifiable Information Via Online Social Networks,” 2010

  24. Emily Steel, Wall Street Journal, “WPP Ad Unit Has Your Profile,” June 27, 2011

  25. Jenn Webb, O’Reilly Radar, “The truth about data: Once it’s out there, it’s hard to control,” April 4, 2011

  26. Jessa Liying Wang & Michael C. Loui, University of Illinois at Urbana-Champaign, “Privacy and Ethical Issues in Location-Based Tracking Systems,” 2009

  27. ALADDIN (ALgorithm ADaptation, Dissemination and INtegration) Center, Privacy in Data Workshop, March 2003

  28. New TRUSTe Survey Finds Consumer Education and Transparency Vital for Sustainable Growth and Success of Online Behavioral Advertising, July 25, 2011

  29. Janice Y. Tsai, Serge Egelman, Lorrie Cranor, Alessandro Acquisti, Information Systems Research, “The Effect of Online Privacy Information on Purchasing Behavior: An Experimental Study,” Vol. 22, No. 2, June 2011

  30. Peter K. Yu, “Digital Copyright and Confuzzling Rhetoric,” April 27, 2011

  31. Ville Oksanen, Mikko Valimaki, International Journal of Media Management, “Transnational Advocacy Network Opposing DRM – a Technical and Legal Challenge to Media Companies,” 2002

  32. Marc Rotenberg, Stanford Technology Law Review, “Fair Information Practices and the Architecture of Privacy (What Larry Doesn’t Get),” 2001

  33. R. Anthony Reese, Columbia Journal of Law and the Arts, “Innocent Infringement in U.S. Copyright Law: A History,” Vol. 30, No. 2, 2007

  34. Pamela Samuelson, “Privacy As Intellectual Property?,” 2000

  35. Alberto Cerda Silva, American University Washington College of Law, “Enforcing Intellectual Property Rights by Diminishing Privacy: How the Anti-Counterfeiting Trade Agreement Jeopardizes the Right to Privacy,” September 1, 2010

  36. Mika D. Ayenson, Dietrich J. Wambach, Ashkan Soltani, Nathaniel Good, & Chris Jay Hoofnagle, “Flash Cookies and Privacy II: Now with HTML5 and ETag Respawning,” August 1, 2011

  37. Burst Media, Online Insights, “Online Privacy Still a Consumer Concern,” February 2009

  38. Burst Media, Online Insights, “Behavioral Targeting, Privacy, and the Impact on Online Advertising,” December 2010

  39. Joseph Turow, Lauren Feldman, & Kimberly Meltzer, Annenberg Policy Center, “Open to Exploitation: American Shoppers Online and Offline,” June 1, 2005

  40. Mary DeRosa, Center for Strategic and International Studies (CSIS), “Data Mining and Data Analysis for Counterterrorism,” March 2004

  41. David S. Evans, Journal of Economic Perspectives, “The Online Advertising Industry: Economics, Evolution, and Privacy,” April 2009

  42. The Constitution Project, “Principles for Government Data Mining: Preserving Civil Liberties in the Information Age,” 2010

  43. Andy Miller, The Economist, “Untangling the Social Web,” September 2, 2010

  44. MacDaily News, “Police adopting iPhone-based facial-recognition device, raising civil-rights questions,” July 13, 2011

  45. Electronic Frontier Foundation, “Report on the Investigative Data Warehouse,” April 2009

  46. Roger Wollenberg, USA Today, “NSA has massive database of Americans’ phone calls,” May 11, 2006

  47. Ellen Nakashima, Washington Post, “FBI Shows off Counterterrorism Database,” August 30, 2006

  48. American Civil Liberties Union, “The Positive Profiling Problem: Learning from the U.S. Experience,” October 1, 2006

  49. Edward Hasbrouck, The Practical Nomad, “What’s in a Passenger Name Record (PNR)?”

  50. Jay Stanley, Huff Post Politics, “Airline Passenger Profiling: Back from the Grave?,” February 8, 2011

  51. United States General Accounting Office, “Data Mining: Federal Efforts Cover a Wide Range of Uses,” May 2004

  52. Newton N. Minow, Fred H. Cate, McGraw Hill Handbook of Homeland Security, “Government Data Mining,” July 8, 2008, pg. 21

  53. Arshad Mohammed and Sara Kehaulani Goo, The Washington Post, “Government Increasingly Turning to Data Mining,” June 15, 2006

  54. Fred H. Cate, Harvard Civil Rights-Civil Liberties Law Review, “Government Data Mining: The Need for a Legal Framework,” Vol. 43, May 21, 2008



[29] David S. Evans, University College London and University of Chicago, “The Online Advertising Industry: Economics, Evolution, and Privacy,” April 2009, pg. 9

[30] David S. Evans, University College London and University of Chicago, “The Online Advertising Industry: Economics, Evolution, and Privacy,” April 2009, pg. 5

[31] Kenneth C. Laudon & Jane Price Laudon, Management Information Systems: Managing the Digital Firm, “Chapter 4: Ethical and Social Issues in Information Systems,” April 2005, pg. 147

[32] Nate Anderson, arsTechnica, “U.S. Government finally admits most piracy estimates are bogus,” August 2010

[33] GAO Report to Congressional Committees, “Intellectual Property: Observations on Efforts to Quantify the Economic Effects of Counterfeit and Pirated Goods,” April 2010, pg. 27

[34] Ross Anderson, “Trusted Computing Frequently Asked Questions,” August 2003

[35] Paul Boutin, Salon, “U.S. prepares to invade your hard drive,” March 29, 2002

[36] Ross Anderson, “Trusted Computing Frequently Asked Questions,” August 2003

[37] Thomas Claburn, InformationWeek, “Amazon Says It Will Stop Deleting Kindle Books,” July 17, 2009

[38] Testimony and Statement for the Record of Marc Rotenberg, Director, Electronic Privacy Information Center Adjunct Professor, Georgetown University Law Center Senior Lecturer, Washington College of Law, on H.R. 2281. The WIPO Copyright Treaties Implementation Act and Privacy Issues, Before the Subcommittee on Telecommunications, Trade, and Consumer Protection, Committee on Commerce, U.S. House of Representatives, June 5, 1998

[39] Jim Finkle, msnbc.com, “Biggest-ever series of cyber attacks uncovered, UN hit,” August 3, 2011

[40] Jim Finkle, msnbc.com, “Biggest-ever series of cyber attacks uncovered, UN hit,” August 3, 2011

[41] Cecilia Kang, The Washington Post, “Web firms face increased federal scrutiny over Internet privacy,” April 8, 2011

[42] Julia Angwin, Wall Street Journal, “The Web’s New Gold Mine: Your Secrets,” July 30, 2010

[43] Julia Angwin, Wall Street Journal, “The Web’s New Gold Mine: Your Secrets,” July 30, 2010

[44] Geoffrey A. Fowler, Wall Street Journal, “More Questions for Wall Street,” October 18, 2010

[45] David Goldman, CNNMoney, “Rapleaf is selling your identity,” October 21, 2010

[46] Balachander Krishnamurthy, Konstantin Naryshkin, Craig E. Wills, AT&T Labs and Worcester Polytechnic Institute, “Privacy leakage vs. Protection measures: the growing disconnect,” 2011, pg. 1

[47] Balachander Krishnamurthy, Craig E. Wills, AT&T Labs and Worcester Polytechnic Institute, “On the Leakage of Personally Identifiable Information Via Online Social Networks,” 2010, pg. 1

[48] Balachander Krishnamurthy, Craig E. Wills, AT&T Labs and Worcester Polytechnic Institute, “On the Leakage of Personally Identifiable Information Via Online Social Networks,” 2010, pg. 2

[49] Emily Steel, Wall Street Journal, “WPP Ad Unit Has Your Profile,” June 27, 2011

[50] Jenn Webb, O’Reilly Radar, “The truth about data: Once it’s out there, it’s hard to control,” April 4, 2011

[51] Jessa Liying Wang & Michael C. Loui, University of Illinois at Urbana-Champaign, “Privacy and Ethical Issues in Location-Based Tracking Systems,” 2009, pg. 1

[52] Andy Miller, The Economist, “Untangling the Social Web,” September 2, 2010

[55] Electronic Frontier Foundation, “Report on the Investigative Data Warehouse,” April 2009

[56] Roger Wollenberg, USA Today, “NSA has massive database of Americans’ phone calls,” May 11, 2006

[57] Jay Stanley, Huff Post Politics, “Airline Passenger Profiling: Back from the Grave?,” February 8, 2011

[58] Newton N. Minow, Fred H. Cate, McGraw Hill Handbook of Homeland Security, “Government Data Mining,” July 8, 2008, pg. 21

[59] Arshad Mohammed and Sara Kehaulani Goo, The Washington Post, “Government Increasingly Turning to Data Mining,” June 15, 2006

[60] Fred H. Cate, Harvard Civil Rights-Civil Liberties Law Review, “Government Data Mining: The Need for a Legal Framework,” Vol. 43, May 21, 2008, pg. 485

[62] New TRUSTe Survey Finds Consumer Education and Transparency Vital for Sustainable Growth and Success of Online Behavioral Advertising, July 25, 2011

[63] Janice Y. Tsai, Serge Egelman, Lorrie Cranor, Alessandro Acquisti, Information Systems Research, “The Effect of Online Privacy Information on Purchasing Behavior: An Experimental Study,” Vol. 22, No. 2, June 2011, pg. 266
