Privacy Design

In Chapter 5, I talk about all aspects of security and the cloud. As we design your overall application architecture in this chapter, however, it is important to consider how you approach an application architecture for systems that have a special segment of private data, notably e-commerce systems that store credit cards and health care systems with health data. We take a brief look at privacy design here, knowing that a full chapter of security awaits us later.

Privacy in the Cloud

The key to privacy in the cloud—or any other environment—is the strict separation of sensitive data from nonsensitive data followed by the encryption of sensitive elements. The simplest example is storing credit cards. You may have a complex e-commerce application storing many data relationships, but you need to separate out the credit card data from the rest of it to start building a secure e-commerce infrastructure.

Note

When I say you need to separate the data, what I mean is that access to either of the two pieces of your data cannot compromise the privacy of the data. In the case of a credit card, you need to store the credit card number on a different virtual server in a different network segment and encrypt that number. Access to the first set of data provides only customer contact info; access to the credit card number provides only an encrypted credit card number.

Figure 4-5 provides an application architecture in which credit card data can be securely managed.

Figure 4-5. Host credit card data behind a web service that encrypts credit card data

It’s a pretty simple design that is very hard to compromise as long as you take the following precautions:

  • The application server and credit card server sit in two different security zones with only web services traffic from the application server being allowed into the credit card processor zone.

  • Credit card numbers are encrypted using a customer-specific encryption key.

  • The credit card processor has no access to the encryption key, except for a short period of time (in memory) while it is processing a transaction on that card.

  • The application server never has the ability to read the credit card number from the credit card server.

  • No person has administrative access to both servers.
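
The first precaution amounts to a default-deny rule at the boundary of the credit card zone. Here is a minimal sketch of that rule as a predicate; the zone names, the port, and the `Flow` type are assumptions made for illustration, and a real deployment would express the same rule in its firewall or security group configuration rather than in application code.

```python
from dataclasses import dataclass

# Hypothetical zone names and port. The text only requires that web services
# traffic from the application server be the sole traffic admitted.
APP_ZONE = "application"
CARD_ZONE = "credit-card"
WEB_SERVICE_PORT = 443


@dataclass(frozen=True)
class Flow:
    source_zone: str
    dest_zone: str
    dest_port: int


def allowed_into_card_zone(flow: Flow) -> bool:
    """Default deny: admit only web services calls from the application zone."""
    if flow.dest_zone != CARD_ZONE:
        raise ValueError("this rule set only governs the credit card zone")
    return flow.source_zone == APP_ZONE and flow.dest_port == WEB_SERVICE_PORT
```

Under these assumptions, an SSH connection from the application zone and a web services call from anywhere else are both rejected.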

Under this architecture, a hacker has no use for the data on any individual server; he must hack both servers to gain access to credit card data. Of course, if your web application is poorly written, no amount of structure will protect you against that failing.

You therefore need to minimize the ability of a hacker to use one server to compromise the other. Because this problem applies to general cloud security, I cover it in detail in Chapter 5. For now, I’ll just list a couple rules of thumb:

  • Make sure the two servers have different attack vectors. In other words, they should not be running the same software. By following this guideline, you make it far less likely that whatever exploit compromised the first server can also be used to compromise the second server.

  • Make sure that neither server contains credentials or other information that will make it possible to compromise the other server. In other words, don’t use passwords for user logins and don’t store any private SSH keys on either server.

Managing the credit card encryption

In order to charge a credit card, you must provide the credit card number, an expiration date, and a varying number of other data elements describing the owner of the credit card. You may also be required to provide a security code.

This architecture separates the basic capture of data from the actual charging of the credit card. When a person first enters her information, the system stores contact info and some basic credit card profile information with the e-commerce application and sends the credit card number over to the credit card processor for encryption and storage.

The first trick is to create a password on the e-commerce server and store it with the customer record. It’s not a password that any user will ever see or use, so you should generate something complex using the strongest password guidelines. You should also create a credit card record on the e-commerce server that stores everything except the credit card number. Figure 4-6 shows a sample e-commerce data model.

Figure 4-6. The e-commerce system stores everything but the credit card number and security code

With that data stored in the e-commerce system database, the system then submits the credit card number, credit card password, and unique credit card ID from the e-commerce system to the credit card processor.
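
The capture step on the e-commerce side might look something like the following sketch. The field names, the `last_four` field, and the `capture_card` function are my assumptions for illustration; they are not taken from the data model in Figure 4-6. The point is only that the local record never holds the card number, while the payload bound for the credit card processor does.

```python
import secrets
import uuid
from dataclasses import dataclass


@dataclass
class CreditCardRecord:
    """What the e-commerce database keeps: no card number, no security code."""
    card_id: str
    customer_id: str
    card_password: str   # generated, never shown to any user
    last_four: str       # assumed "profile" field, handy for display
    expiration: str
    name_on_card: str


def capture_card(customer_id, card_number, expiration, name_on_card):
    """Return (record to store locally, payload for the credit card processor)."""
    record = CreditCardRecord(
        card_id=str(uuid.uuid4()),
        customer_id=customer_id,
        card_password=secrets.token_urlsafe(32),  # 32 random bytes
        last_four=card_number[-4:],
        expiration=expiration,
        name_on_card=name_on_card,
    )
    payload = {
        "card_id": record.card_id,
        "card_password": record.card_password,
        "card_number": card_number,  # leaves this server and is not kept
    }
    return record, payload
```

Using `secrets` rather than `random` matters here, because the password is the only thing standing between a stolen processor database and the card numbers.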

The credit card processor does not store the password. Instead, it uses the password to derive the key with which it encrypts the credit card number, stores the encrypted credit card number, and associates it with the credit card ID. Figure 4-7 shows the credit card processor data model.

Figure 4-7. The credit card processor stores the encrypted credit card number and associates it with the e-commerce credit card ID

Neither system stores a customer’s security code, because the credit card companies do not allow you to store this code.

Processing a credit card transaction

When it comes time to charge the credit card, the e-commerce service submits a request to the credit card processor to charge the card for a specific amount. The e-commerce system refers to the credit card on the credit card processor using the unique ID that was created when the credit card was first stored. It passes over the credit card password, the security code, and the amount to be charged. Because neither system stores the security code, it has to be collected from the customer at the time of the transaction. The credit card processor then decrypts the credit card number for the specified credit card using the specified password. The unencrypted credit card number, security code, and amount are then passed to the bank to complete the transaction.
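
The storage and charge steps on the credit card processor can be sketched together as follows. Everything named here is an assumption for illustration: the `CreditCardProcessor` class, the `bank` callable, and the PBKDF2 parameters. In particular, the Python standard library ships no block cipher, so `_keystream_xor` is a stand-in built from HMAC-SHA256; a real system should use a vetted authenticated cipher such as AES-GCM from a maintained library.

```python
import hashlib
import hmac
import secrets


def _derive_keys(password: str, salt: bytes):
    """Derive separate encryption and MAC keys from the card password."""
    material = hashlib.pbkdf2_hmac(
        "sha256", password.encode(), salt, 100_000, dklen=64
    )
    return material[:32], material[32:]


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Stand-in cipher (NOT for production): XOR with an HMAC-SHA256
    counter-mode keystream. Applying it twice restores the input."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hmac.new(key, nonce + counter, hashlib.sha256).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


class CreditCardProcessor:
    def __init__(self, bank):
        self._bank = bank   # hypothetical callable: bank(number, code, amount)
        self._cards = {}    # card_id -> (salt, nonce, ciphertext, tag)

    def store(self, card_id, card_password, card_number):
        salt, nonce = secrets.token_bytes(16), secrets.token_bytes(16)
        enc_key, mac_key = _derive_keys(card_password, salt)
        ciphertext = _keystream_xor(enc_key, nonce, card_number.encode())
        tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
        # Only the encrypted number is kept; the password and keys are not.
        self._cards[card_id] = (salt, nonce, ciphertext, tag)

    def charge(self, card_id, card_password, security_code, amount):
        salt, nonce, ciphertext, tag = self._cards[card_id]
        enc_key, mac_key = _derive_keys(card_password, salt)
        expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            raise PermissionError("wrong password for this card")
        card_number = _keystream_xor(enc_key, nonce, ciphertext).decode()
        # The plaintext number exists only in memory for this call.
        return self._bank(card_number, security_code, amount)
```

The per-record salt is stored in the clear alongside the ciphertext, which is normal practice; the password that keys the derivation lives only on the e-commerce server.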

If the e-commerce application is compromised

If the e-commerce application is compromised, the attacker has access only to the nonsensitive customer contact info. There is no mechanism by which he can download that database and access credit card information or otherwise engage in identity theft. That would require compromising the credit card processor separately.

Having said all of that, if your e-commerce application is insecure, an attacker can still assume the identity of an existing user and place orders in their name with deliveries to their address. In other words, you still need to worry about the design of each component of the system.

Note

Obviously, you don’t want intruders gaining access to your customer contact data either. In the context of this section, my description of customer contact data as “nonsensitive” is relative. Your objective should be to keep an intruder from getting to either bit of data.

If the credit card processor is compromised

Compromising the credit card processor is even less useful than compromising the e-commerce application. If an attacker gains access to the credit card database, all he has are random unique IDs and strongly encrypted credit card numbers, each encrypted with a unique encryption key. The attacker can copy the database and attempt to brute-force the numbers offline, but each number will take a lot of time to crack, and cracking one does nothing to help with the next. Ultimately, the effort provides the hacker with a credit card number that has no individually identifying information to use in identity theft.

Another attack vector would be to figure out how to stick a Trojan application on the compromised server and listen for decryption passwords. However, if you are running intrusion detection software as suggested in Chapter 5, even this attack becomes very hard to carry out undetected.

When the Amazon Cloud Fails to Meet Your Needs

The architecture I described in the previous section matches traditional noncloud deployments fairly closely. You may run into challenges deploying in the Amazon cloud, however, because of a couple of critical issues involving the processing of sensitive data:

  • Some laws and specifications impose conditions on the political and legal jurisdictions where the data is stored. In particular, companies doing business in the EU may not store private data about EU citizens on servers in the U.S. (or any other nation falling short of EU privacy standards).

  • Some laws and specifications were not written with virtualization in mind. In other words, they specify physical servers in cases where virtual servers would do identically well, simply because a server meant a physical server at the time the law or standard was written.

The first problem has a pretty clear solution: if you are doing business in the EU and managing private data on EU citizens, that data must be handled on servers with a physical presence in the EU, stored on storage devices physically in the EU, and not pass through infrastructure managed outside the EU.

Amazon provides a presence in both the U.S. and EU. As a result, you can solve the first problem by carefully architecting your Amazon solution. It requires, however, that you associate the provisioning of instances and storage of data with your data management requirements.
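
One way to tie provisioning to data management requirements is to make every provisioning request look up its region from the jurisdiction of the data it will hold. The sketch below assumes a hypothetical `REGION_FOR_JURISDICTION` table and request format; the region identifiers are examples of one U.S. and one EU Amazon region, and the choice of which regions satisfy which rules is yours to make with legal advice.

```python
# Hypothetical mapping from a data jurisdiction to the Amazon region in which
# instances and storage for that data may be provisioned.
REGION_FOR_JURISDICTION = {
    "EU": "eu-west-1",
    "US": "us-east-1",
}


def region_for(jurisdiction: str) -> str:
    """Fail closed: refuse to provision when no region has been approved."""
    try:
        return REGION_FOR_JURISDICTION[jurisdiction]
    except KeyError:
        raise LookupError(
            f"no approved region for jurisdiction {jurisdiction!r}"
        ) from None


def provisioning_request(jurisdiction: str, kind: str) -> dict:
    """Build a request whose region is dictated by the data, not the caller."""
    return {"kind": kind, "region": region_for(jurisdiction)}
```

Failing closed is the design choice that matters: an unmapped jurisdiction should stop provisioning rather than fall back to a default region.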

The second issue is especially problematic for solutions such as Amazon that rely entirely on virtualization. In this case, however, it’s for fairly stupid reasons. You can live up to the spirit of the law or specification, but because the concept of virtualization was not common at the time, you cannot live up to the letter of the law or specification. The workaround for this scenario is similar to the workaround for the first problem.

In solving these challenges, you want to do everything to realize as many of the benefits of the cloud as possible without running private data through the cloud and without making the overall complexity of the system so high that it just isn’t worth it. Such solutions tend to be easier to build with cloud providers such as Rackspace and GoGrid than as a hybrid of Amazon and something else.

To meet this challenge, you must route and store all private information outside the cloud, but execute as much application logic as possible inside the cloud. You can accomplish this goal by following the general approach I described for credit card processing and abstracting the concepts out into a privacy server and a web application server:

  • The privacy server sits outside the cloud and has the minimal support structures necessary to handle your private data.

  • The web application server sits inside the cloud and holds the bulk of your application logic.

Figure 4-8. Pulling private data out of the cloud creates three different application components

Because the objective of a privacy server is simply to physically segment out private data, you do not necessarily need to encrypt everything on the privacy server. Figure 4-8 illustrates how the e-commerce system might evolve into a privacy architecture designed to store all private data outside of the cloud.

As with the cloud-based e-commerce system, you store credit card data on its own server in its own network segment. The only difference for the credit card processor is that this time it is outside of the cloud.

The new piece to this puzzle is the customer’s personally identifying information. This data now exists on its own server outside of the cloud, but still separate from credit card data. Any action that saves user profile information executes against the privacy server instead of the main web application. Under no circumstances does the main web application have any access to personally identifying information, unless that data is aggregated before being presented to the web application.
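
The split between the two components can be sketched as below. The class and method names are hypothetical, and a real privacy server would of course be a separate network service rather than an in-process object. What the sketch shows is the contract: the web application works with opaque customer IDs and aggregated results, and nothing else.

```python
import uuid
from collections import Counter


class PrivacyServer:
    """Runs outside the cloud and is the only holder of identifying data."""

    def __init__(self):
        self._profiles = {}

    def save_profile(self, name, email, country):
        """Called by the client directly; returns an opaque customer ID."""
        customer_id = str(uuid.uuid4())
        self._profiles[customer_id] = {
            "name": name, "email": email, "country": country,
        }
        return customer_id

    def customers_by_country(self):
        """Aggregate view: counts only, no names or addresses."""
        return dict(Counter(p["country"] for p in self._profiles.values()))


class WebApplication:
    """Runs inside the cloud; records orders against opaque customer IDs."""

    def __init__(self, privacy_server):
        self._privacy = privacy_server
        self.orders = []

    def place_order(self, customer_id, sku):
        self.orders.append({"customer_id": customer_id, "sku": sku})

    def market_report(self):
        # The only privacy-server data the web application ever sees.
        return self._privacy.customers_by_country()
```

Keep in mind that an aggregate over a very small group can still identify someone, so the aggregation itself deserves some care.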

How useful this architecture is depends heavily on how much processing you are doing that has nothing to do with private data. If all of your transactions involve the reading and writing of private data, you gain nothing by adding this complexity. On the other hand, if the management of private data is just a tiny piece of the application, you can gain all of the advantages of the cloud for the other parts of the application while still respecting any requirements around physical data location.
