
RESTful Web Services

By Leonard Richardson... Published by O'Reilly Media, Inc.

Standard Features of HTTP

HTTP has several features designed to solve specific engineering problems. Many of these features are not widely known, either because the problems they solve don’t come up very often on the human web, or because today’s web browsers implement them transparently. When working on the programmable web, you should know about these features, so you don’t reinvent them or prematurely give up on HTTP as an application protocol.

Authentication and Authorization

By now you probably know that HTTP authentication and authorization are handled with HTTP headers—“stickers” on the HTTP “envelope.” You might not know that these headers were designed to be extensible. HTTP defines two authentication schemes, but there’s a standard way of integrating other authentication schemes into HTTP, by customizing values for the headers Authorization and WWW-Authenticate. You can even define custom authentication schemes and integrate them into HTTP: I’ll show you how that’s done by adapting a small portion of the WS-Security standard to work with HTTP authentication. But first, I’ll cover the two predefined schemes.

Basic authentication

Basic authentication is a simple challenge/response. If you try to access a resource that’s protected by basic authentication, and you don’t provide the proper credentials, you receive a challenge and you have to make the request again. It’s used by the web service I showed you in Chapter 2, as well as my mapping service in Chapter 6 and my clone in Chapter 7.

Here’s an example. I make a request for a protected resource, not realizing it’s protected:

GET /resource.html HTTP/1.1

I didn’t include the right credentials. In fact, I didn’t include any credentials at all. The server sends me the following response:

401 Unauthorized
WWW-Authenticate: Basic realm="My Private Data"

This is a challenge. The server dares me to repeat my request with the correct credentials. The WWW-Authenticate header gives two clues about what credentials I should send. It identifies what kind of authentication it’s using (in this case, Basic), and it names a realm. The realm can be any name you like, and it’s generally used to identify a collection of resources on a site. In Chapter 7 the realm was “Social bookmarking service” (I defined it in Example 7-11). A single web site might have many sets of protected resources guarded in different ways: the realm lets the client know which authentication credentials it should provide. The realm is the what, and the authentication type is the how.

To meet a Basic authentication challenge, the client needs a username and a password. This information might be filed in a cache under the name of the realm, or the client may have to prompt an end user for this information. Once the client has this information, username and password are combined into a single string and encoded with base 64 encoding. Most languages have a standard library for doing this kind of encoding: Example 8-1 uses Ruby to encode a username and password.

Example 8-1. Base 64 encoding in Ruby

# calculate-base64.rb
USER="Alibaba"
PASSWORD="open sesame"

require 'base64'
puts Base64.encode64("#{USER}:#{PASSWORD}")
# QWxpYmFiYTpvcGVuIHNlc2FtZQ==

This seemingly random string of characters is the value of the Authorization header. Now I can send my request again, using the username and password as Basic auth credentials.

GET /resource.html HTTP/1.1
Authorization: Basic QWxpYmFiYTpvcGVuIHNlc2FtZQ==

The server decodes this string and matches it against its user and password list. If they match, the response is processed further. If not, the request fails, and once again the status code is 401 (“Unauthorized”).
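On the server side, that check is just Base64 decoding and a lookup. Here's a minimal sketch (my illustration, not code from this book); the USERS hash stands in for whatever credential store a real server would use:

```ruby
# basic-auth-check.rb (illustrative sketch)
require 'base64'

# Stand-in credential store; a real server would use a database,
# and would store a one-way hash of each password rather than the
# password itself.
USERS = { "Alibaba" => "open sesame" }

def basic_auth_ok?(authorization)
  scheme, encoded = authorization.to_s.split(' ', 2)
  return false unless scheme == 'Basic' && encoded
  user, password = Base64.decode64(encoded).split(':', 2)
  USERS[user] == password
end
```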

Of course, if the server can decode this string, so can anyone who snoops on your network traffic. Basic authentication effectively transmits usernames and passwords in plain text. One solution to this is to use HTTPS, also known as Transport Layer Security or Secure Sockets Layer. HTTPS encrypts all communications between client and server, incidentally including the Authorization header. When I added authentication to my map service in Chapter 6, I switched from plain HTTP to encrypted HTTPS.

Digest authentication

HTTP Digest authentication is another way to hide the authorization credentials from network snoops. It’s more complex than Basic authentication, but it’s secure even over unencrypted HTTP. Digest follows the same basic pattern as Basic: the client issues a request, and gets a challenge. Here’s a sample challenge:

401 Unauthorized
WWW-Authenticate: Digest realm="My Private Data",
  qop="auth",
  nonce="0cc175b9c0f1b6a831c399e269772661",
  opaque="92eb5ffee6ae2fec3ad71c777531578f"

This time, the WWW-Authenticate header says that the authentication type is Digest. The header specifies a realm as before, but it also contains three other pieces of information, including a nonce: a random string that changes on every request.

The client’s responsibility is to turn this information into an encrypted string that proves the client knows the password, but that doesn’t actually contain the password. First the client generates a client-side nonce and a sequence number. Then the client makes a single “digest” string out of a huge amount of information: the HTTP method and path from the request, the four pieces of information from the challenge, the username and password, the client-side nonce, and the sequence number. The formula for doing this is considerably more complicated than for Basic authentication (see Example 8-2).

Example 8-2. HTTP digest calculation in Ruby

# calculate-http-digest.rb
require 'digest/md5'

# Information from the original request
METHOD="GET"
PATH="/resource.html"

# Information from the challenge
REALM="My Private Data"
NONCE="0cc175b9c0f1b6a831c399e269772661"
OPAQUE="92eb5ffee6ae2fec3ad71c777531578f"
QOP="auth"

# Information calculated by or known to the client
CNONCE="0a4f113b"
NC="00000001"
USER="Alibaba"
PASSWORD="open sesame"

# Calculate the final digest in three steps.
ha1 = Digest::MD5.hexdigest("#{USER}:#{REALM}:#{PASSWORD}")
ha2 = Digest::MD5.hexdigest("#{METHOD}:#{PATH}")
ha3 = Digest::MD5.hexdigest("#{ha1}:#{NONCE}:#{NC}:#{CNONCE}:#{QOP}:#{ha2}")

puts ha3
# 2370039ff8a9fb83b4293210b5fb53e3

The digest string is similar to the S3 request signature in Chapter 3. It proves certain things about the client. You could never produce this string unless you knew the client’s username and password, knew what request the client was trying to make, and knew which challenge the server had sent in response to the first request.

Once the digest is calculated, the client resends the request and passes back all the constants (except, of course, the password), as well as the final result of the calculation:

GET /resource.html HTTP/1.1
Authorization: Digest username="Alibaba",
  realm="My Private Data",
  nonce="0cc175b9c0f1b6a831c399e269772661",
  uri="/resource.html",
  qop=auth,
  nc=00000001,
  cnonce="0a4f113b",
  response="2370039ff8a9fb83b4293210b5fb53e3",
  opaque="92eb5ffee6ae2fec3ad71c777531578f"

The cryptography is considerably more complicated, but the process is the same as for HTTP Basic auth: request, challenge, response. One key difference is that even the server can’t figure out your password from the digest. When a client initially sets a password for a realm, the server needs to calculate the hash of user:realm:password (ha1 in the example above), and keep it on file. That gives the server the information it needs to calculate the final value of ha3, without storing the user’s actual password.

A second difference is that every request the client makes is actually two requests. The point of the first request is to get a challenge: it includes no authentication information, and it always fails with a status code of 401 (“Unauthorized”). But the WWW-Authenticate header includes a unique nonce, which the client can use to construct an appropriate Authorization header. It makes a second request, using this header, and this one is the one that succeeds. In Basic auth, the client can avoid the challenge by sending its authorization credentials along with the first request. That’s not possible in Digest.

Digest authentication has some options I haven’t shown here. Specifying qop=auth-int instead of qop=auth means that the calculation of ha2 above must include the request’s entity-body, not just the HTTP method and the URI path. This prevents a man-in-the-middle from tampering with the representations that accompany PUT and POST requests.

My goal here isn’t to dwell on the complex mathematics— that’s what libraries are for. I want to demonstrate the central role the WWW-Authenticate and Authorization headers play in this exchange. The WWW-Authenticate header says, “Here’s everything you need to know to authenticate, assuming you know the secret.” The Authorization header says, “I know the secret, and here’s the proof.” Everything else is parameter parsing and a few lines of code.

WSSE username token

What if neither HTTP Basic nor HTTP Digest works for you? You can define your own standards for what goes into WWW-Authenticate and Authorization. Here’s one real-life example. It turns out that, for a variety of technical reasons, users with low-cost hosting accounts can’t take advantage of either HTTP Basic or HTTP Digest.[26] At one time, this was important to a segment of the Atom community. Coming up with an entirely new cryptographically secure option was beyond the ability of the Atom working group. Instead, they looked to the WS-Security specification, which defines several different ways of authenticating SOAP messages with SOAP headers. (SOAP headers are the “stickers” on the SOAP envelope I mentioned back in Chapter 1.) They took a single idea—WS-Security UsernameToken—from this standard and ported it from SOAP headers to HTTP headers. They defined an extension to HTTP that used WWW-Authenticate and Authorization in a way that people with low-cost hosting accounts could use. We call the resulting extension WSSE UsernameToken, or WSSE for short. (WSSE just means WS-Security Extension. Other extensions would have a claim to the same name, but there aren’t any others right now.)

WSSE is like Digest in that the client runs their password through a hash algorithm before sending it across the network. The basic pattern is the same: the client makes a request, gets a challenge, and formulates a response. A WSSE challenge might look like this:

HTTP/1.1 401 Unauthorized
WWW-Authenticate: WSSE realm="My Private Data", profile="UsernameToken"

Instead of Basic or Digest, the authentication type is WSSE. The realm serves the same purpose as before, and the “profile” tells the client that the server expects it to generate a response using the UsernameToken rules (as opposed to some other rule from WS-Security that hasn’t yet been ported to HTTP headers). The UsernameToken rules mean that the client generates a nonce, then hashes their password along with the nonce and the current date (see Example 8-3).

Example 8-3. Calculating a WSSE digest

# calculate-wsse-digest.rb
require 'base64'
require 'digest/sha1'

PASSWORD = "open sesame"
NONCE = "EFD89F06CCB28C89"
CREATED = "2007-04-13T09:00:00Z"

puts Base64.encode64(Digest::SHA1.digest("#{NONCE}#{CREATED}#{PASSWORD}"))
# Z2Y59TewHV6r9BWjtHLkKfUjm2k=

Now the client can send a response to the WSSE challenge:

GET /resource.html HTTP/1.1
Authorization: WSSE profile="UsernameToken"
X-WSSE: UsernameToken Username="Alibaba",
  PasswordDigest="Z2Y59TewHV6r9BWjtHLkKfUjm2k=",
  Nonce="EFD89F06CCB28C89",
  Created="2007-04-13T09:00:00Z"

Same headers. Different authentication method. Same message flow. Different hash algorithm. That’s all it takes to extend HTTP authentication. If you’re curious, here’s what those authentication credentials would look like as a SOAP header under the original WS-Security UsernameToken standard.

<wsse:UsernameToken>
  <wsse:Username>Alibaba</wsse:Username>
  <wsse:Password Type="wsse:PasswordDigest">Z2Y59TewHV6r9BWjtHLkKfUjm2k=</wsse:Password>
  <wsse:Nonce>EFD89F06CCB28C89</wsse:Nonce>
  <wsu:Created>2007-04-13T09:00:00Z</wsu:Created>
</wsse:UsernameToken>

WSSE UsernameToken authentication has two big advantages. It doesn’t send the password in the clear over the network, the way HTTP Basic does, and it doesn’t require any special setup on the server side, the way HTTP Digest usually does. It’s got one big disadvantage. Under HTTP Basic and Digest, the server can keep a one-way hash of the password instead of the password itself. If the server gets cracked, the passwords are still (somewhat) safe. With WSSE UsernameToken, the server must store the password in plain text, or it can’t verify the responses to its challenges. If someone cracks the server, they’ve got all the passwords. The extra complexity of HTTP Digest is meant to stop this from happening. Security always involves tradeoffs like these.
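That tradeoff is easy to see in code. In this sketch (my illustration, not the Atom working group's), the server has no choice but to keep the plain-text password around, because verifying a challenge response means recomputing the same digest the client computed:

```ruby
# wsse-check.rb (illustrative sketch)
require 'base64'
require 'digest/sha1'

# WSSE forces the server to store the actual password, not a
# one-way hash of it.
PLAINTEXT_PASSWORDS = { "Alibaba" => "open sesame" }

# The same calculation the client performs (see Example 8-3).
def wsse_digest(nonce, created, password)
  Base64.encode64(Digest::SHA1.digest("#{nonce}#{created}#{password}")).chomp
end

def wsse_ok?(username, nonce, created, password_digest)
  password = PLAINTEXT_PASSWORDS[username]
  return false unless password
  password_digest == wsse_digest(nonce, created, password)
end
```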


Compression

Textual representations like XML documents can be compressed to a fraction of their original size. An HTTP client library can request a compressed version of a representation and then transparently decompress it for its user. Here’s how it works: along with an HTTP request the client sends an Accept-Encoding header that says what kind of compression algorithms the client understands. The two standard values for Accept-Encoding are compress and gzip.

GET /resource.html HTTP/1.1
Accept-Encoding: gzip,compress

If the server understands one of the compression algorithms from Accept-Encoding, it can use that algorithm to compress the representation before serving it. The server sends the same Content-Type it would send if the representation wasn’t compressed. But it also sends the Content-Encoding header, so the client knows the document has been compressed:

200 OK
Content-Type: text/html
Content-Encoding: gzip

[Binary representation goes here]

The client decompresses the data using the algorithm given in Content-Encoding, and then treats it as the media type given as Content-Type. In this case the client would use the gzip algorithm to decompress the binary data back into an HTML document. This technique can save a lot of bandwidth, with very little cost in additional complexity.
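If your HTTP library doesn't decompress transparently, doing it yourself takes only a few lines. Here's a sketch (my example, not the book's) using Ruby's standard zlib library:

```ruby
# decode-body.rb (illustrative sketch)
require 'zlib'
require 'stringio'

# Undo the transfer-time compression named by Content-Encoding,
# leaving a body the client can parse as its Content-Type.
def decode_body(body, content_encoding)
  case content_encoding
  when 'gzip'
    Zlib::GzipReader.new(StringIO.new(body)).read
  when nil, 'identity'
    body
  else
    raise "Unsupported Content-Encoding: #{content_encoding}"
  end
end
```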

You probably remember that I think different representations of a resource should have distinct URIs. Why do I recommend using HTTP headers to distinguish between compressed and uncompressed versions of a representation? Because I don’t think the compressed and uncompressed versions are different representations. Compression, like encryption, is something that happens to a representation in transit, and must be undone before the client can use the representation. In an ideal world, HTTP clients and servers would compress and decompress representations automatically, and programmers should not have to even think about it. Today, most web browsers automatically request compressed representations, but few programmable clients do.

Conditional GET

Conditional HTTP GET allows a server and client to work together to save bandwidth. I covered it briefly in Chapter 5, in the context of the mapping service. There, the problem was sending the same map tiles over and over again to clients who had already received them. This is a more general treatment of the same question: how can a service keep from sending representations to clients that already have them?

Neither client nor server can solve this problem alone. If the client retrieves a representation and never talks to the server again, it will never know when the representation has changed. The server keeps no application state, so it doesn’t know when a client last retrieved a certain representation. HTTP isn’t a reliable protocol anyway, and the client might not have received the representation the first time. So when the client requests a representation, the server has no idea whether the client has done this before—unless the client provides that information as part of the application state.

Conditional HTTP GET requires client and server to work together. When the server sends a representation, it sets some HTTP response headers: Last-Modified and/or ETag. When the client requests the same representation, it should send the values for those headers as If-Modified-Since and/or If-None-Match. This lets the server make a decision about whether or not to resend the representation. Example 8-4 gives a demonstration of conditional HTTP GET.

Example 8-4. Make a regular GET request, then a conditional GET request

# fetch-oreilly-conditional.rb

require 'rubygems'
require 'rest-open-uri'
uri = ''

# Make an HTTP request and then describe the response.
def request(uri, *args)
  begin
    response = open(uri, *args)
  rescue OpenURI::HTTPError => e
    response = e.io
  end
  puts " Status code: #{response.status.inspect}"
  puts " Representation size: #{response.size}"
  last_modified = response.meta['last-modified']
  etag = response.meta['etag']
  puts " Last-Modified: #{last_modified}"
  puts " Etag: #{etag}"
  return last_modified, etag
end
puts "First request:"
last_modified, etag = request(uri)

puts "Second request:"
request(uri, 'If-Modified-Since' => last_modified, 'If-None-Match' => etag)

If you run that code, it fetches the document twice: once normally and once conditionally. It prints information about each request. The printed output for the first request will look something like this:

First request:
 Status code: ["200", "OK"]
 Representation size: 41123
 Last-Modified: Sun, 21 Jan 2007 09:35:19 GMT
 Etag: "7359b7-a37c-45b333d7"

The Last-Modified and Etag headers are the ones that make HTTP conditional GET possible. To use them, I make the HTTP request again, but this time I use the value of Last-Modified as If-Modified-Since, and the value of ETag as If-None-Match. Here’s the result:

Second request:
 Status code: ["304", "Not Modified"]
 Representation size: 0
 Etag: "7359b7-a0a3-45b5d90e"

Instead of a 40-KB representation, the second request gets a 0-byte representation. Instead of 200 (“OK”), the status code is 304 (“Not Modified”). The second request saved 40 KB of bandwidth because it made the HTTP request conditional on the representation actually having changed since last time. The representation didn’t change, so it wasn’t resent.

Last-Modified is a pretty easy header to understand: it’s the last time the representation of this resource changed. You may be able to view this information in your web browser by going to “view page info” or something similar. Sometimes humans check a web page’s Last-Modified time to see how recent the data is, but its main use is in conditional HTTP requests.

If-Modified-Since makes an HTTP request conditional. If the condition is met, the server carries out the request as it would normally. Otherwise, the condition fails and the server does something unusual. For If-Modified-Since, the condition is: “the representation I’m requesting must have changed after this date.” The condition succeeds when the server has a newer representation than the client does. If the client and server have the same representation, the condition fails and the server does something unusual: it omits the representation and sends a status code of 304 (“Not Modified”). That’s the server’s way of telling the client: “reuse the representation you saved from last time.”
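Server-side, the whole decision fits in a few lines. A sketch (the response-hash shape is my invention; real frameworks have their own response objects):

```ruby
# conditional-get.rb (illustrative sketch)
require 'time'

# Decide between 200 with a full representation and an empty 304,
# based on the client's If-Modified-Since header.
def respond_to_get(resource_modified, representation, if_modified_since)
  if if_modified_since && resource_modified <= Time.httpdate(if_modified_since)
    # Condition fails: "reuse the representation you saved from last time."
    { :status => 304, :body => '' }
  else
    { :status => 200, :body => representation,
      :headers => { 'Last-Modified' => resource_modified.httpdate } }
  end
end
```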

Both client and server benefit here. The server doesn’t have to send a representation of the resource, and the client doesn’t have to wait for it. Both sides save bandwidth. This is one of the tricks underlying your web browser’s cache, and there’s no reason not to use it in custom web clients.

How does the server calculate when a representation was last modified? A web server like Apache has it easy: it mostly serves static files from disk, and filesystems already track the modification date for every file. Apache just gets that information from the filesystem. In more complicated scenarios, you’ll need to break the representation down into its component parts and see when each bit of resource state was last modified. In Chapter 7, the Last-Modified value for a list of bookmarks was the most recent timestamp in the list. If you’re not tracking this information, the bandwidth savings you get by supporting Last-Modified might make it worth your while to start tracking it.

Even when a server provides Last-Modified, it’s not totally reliable. Let’s say a client GETs a representation at 12:30:00.3 and sees a Last-Modified with the time “12:30:00.” A tenth of a second later, the representation changes, but the Last-Modified time is still “12:30:00.” If the client tries a conditional GET request using If-Modified-Since, the server will send a 304 (“Not Modified”) response, even though the resource was modified after the original GET. One second is not a high enough resolution to keep track of when a resource changes. In fact, no resolution is high enough to keep track of when a resource changes with total accuracy.

This is not quite satisfactory. The world cries out for a completely reliable way of checking whether or not a representation has been modified since last you retrieved it. Enter the Etag response header. The Etag (it stands for “entity tag”) is a nonsensical string that must change whenever the corresponding representation changes.

The If-None-Match request header is to Etag as the If-Modified-Since request header is to Last-Modified. It’s a way of making an HTTP request conditional. In this case, the condition is “the representation has changed, as embodied in the entity tag.” It’s supposed to be a totally reliable way of identifying changes between representations.

It’s easy to generate a good ETag for any representation. Transformations like the MD5 hash can turn any string of bytes into a short string that’s unique except in pathological cases. The problem is, by the time you can run one of those transformations, you’ve already created the representation as a string of bytes. You may save bandwidth by not sending the representation over the wire, but you’ve already done everything necessary to build it.

The Apache server uses filesystem information like file size and modification time to generate Etag headers for static files without reading their contents. You might be able to do the same thing for your representations: pick the data that tends to change, or summary data that changes along with the representation. Instead of doing an MD5 sum of the entire representation, just do a sum of the important data. The Etag header doesn’t need to incorporate every bit of data in the representation: it just has to change whenever the representation changes.
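Both strategies are short enough to sketch (my illustrations, not Apache's actual algorithm):

```ruby
# etag-strategies.rb (illustrative sketch)
require 'digest/md5'

# Strategy 1: hash the whole representation. Totally reliable, but by
# the time you can run it you've already built the representation.
def etag_from_representation(body)
  '"' + Digest::MD5.hexdigest(body) + '"'
end

# Strategy 2, Apache-style: derive the tag from cheap summary data
# (here, byte size and modification time) that changes whenever the
# representation changes.
def etag_from_metadata(size, mtime)
  format('"%x-%x"', size, mtime.to_i)
end
```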

If a server provides both Last-Modified and Etag, the client can provide both If-Modified-Since and If-None-Match in subsequent requests (as I did in Example 8-4). The server should make both checks: it should only send a new representation if the representation has changed and the Etag is different.


Caching

Conditional HTTP GET gives the client a way to refresh a representation by making a GET request that uses very little bandwidth if the representation has not changed. Caching gives the client some rough guidelines that can make it unnecessary to make that second GET request at all.

HTTP caching is a complex topic, even though I’m limiting my discussion to client-side caches and ignoring proxy caches that sit between the client and the server.[27] The basics are these: when a client makes an HTTP GET or HEAD request, it might be able to cache the HTTP response document, headers and all. The next time the client is asked to make the same GET or HEAD request, it may be able to return the cached document instead of actually making the request again. From the perspective of the user (a human using a web browser, or a computer program using an HTTP library), caching is transparent. The user triggers a request, but instead of making an actual HTTP request, the client retrieves a cached response and presents it as though it had just come from the server. I’m going to focus on three topics from the point of view of the service provider: how you can tell the client to cache, how you can tell the client not to cache, and when the client might be caching without you knowing it.

Please cache

When the server responds to a GET or HEAD request, it may send a date in the response header Expires. For instance:

Expires: Tue, 30 Jan 2007 17:02:06 GMT

This header tells the client (and any proxies between the server and client) how long the response may be cached. The date may range from a date in the past (meaning the response has expired by the time it gets to the client) to a date a year in the future (which means, roughly, “the response will never expire”). After the time specified in Expires, the response becomes stale. This doesn’t mean that it must be removed from the cache immediately. The client might be able to make a conditional GET request, find out that the response is actually still fresh, and update the cache with a new expiration date.

The value of Expires is a rough guide, not an exact date. Most services can’t predict to the second when a response is going to change. If Expires is an hour in the future, that means the server is pretty sure the response won’t change for at least an hour. But something could legitimately happen to the resource the second after that response is sent, invalidating the cached response immediately. When in doubt, the client can make another HTTP request, hopefully a conditional one.

The server should not send an Expires that gives a date more than a year in the future. Even if the server is totally confident that a particular response will never change, a year is a long time. Software upgrades and other events in the real world tend to invalidate cached responses sooner than you’d expect.

If you don’t want to calculate a date at which a response should become stale, you can use Cache-Control to say that a response should be cached for a certain number of seconds. This response can be cached for an hour:

Cache-Control: max-age=3600
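On the client side, the freshness check boils down to a comparison against max-age or Expires. A sketch (the header names are real; the cache layout is my assumption):

```ruby
# freshness.rb (illustrative sketch)
require 'time'

# Is a cached response still fresh? Cache-Control: max-age takes
# precedence over Expires when both are present, which is how real
# caches behave.
def fresh?(headers, retrieved_at, now)
  cache_control = headers['Cache-Control'].to_s
  if cache_control =~ /max-age=(\d+)/
    now - retrieved_at < $1.to_i
  elsif headers['Expires']
    now < Time.httpdate(headers['Expires'])
  else
    false  # no explicit lifetime: fall back to the default caching rules
  end
end
```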

Thank you for not caching

That covers the case when the server would like the client to cache. What about the opposite? Some responses to GET requests are dynamically generated and different every time: caching them would be useless. Some contain sensitive information that shouldn’t be stored where someone else might see it: caching them would cause security problems. Use the Cache-Control header to convey that the client should not cache the representation at all:

Cache-Control: no-cache

Where Expires is a fairly simple response header, the Cache-Control header is very complex. It’s the primary interface for controlling client-side caches, and proxy caches between the client and server. It can be sent as a request or as a response header, but I’m just going to talk about its use as a response header, since my focus is on how the server can work with a client-side cache.

I already showed how specifying “max-age” in Cache-Control controls how long a response can stay fresh in a cache. A value of “no-cache” prevents the client from caching a response at all. A third value you might find useful is “private,” which means that the response may be cached by a client cache, but not by any proxy cache between the client and server.

Default caching rules

In the absence of Expires or Cache-Control, section 13 of the HTTP standard defines a complex set of rules about when a client can cache a response. Unless you’re going to set caching headers on every response, you’ll need to know when a client is likely to cache what you send, so that you can override the defaults when appropriate. I’ll summarize the basic common-sense rules here.

In general, the client may cache the responses to its successful HTTP GET and HEAD requests. “Success” is defined in terms of the HTTP status code: the most common cacheable status codes are 200 (“OK”), 301 (“Moved Permanently”), and 410 (“Gone”).

Many (poorly-designed) web applications expose URIs that trigger side effects when you GET them. These dangerous URIs usually contain query strings. The HTTP standard recommends that if a URI contains a query string, the response from that URI should not be automatically cached: it should only be cached if the server explicitly says caching is OK. If the client GETs this kind of URI twice, it should trigger the side effects twice, not trigger them once and then get a cached copy of the response from last time.

If a client makes a PUT, POST, or DELETE request to a URI, any cached responses from that URI immediately become stale. The same is true of any URI mentioned in the Location or Content-Location header of the response to a PUT, POST, or DELETE request. There's a wrinkle here, though: site A can't affect how the client caches responses from site B. If you POST to a URI, any cached response from that URI is automatically stale. If the response comes back with a Location header naming a second URI on the same site, any cached response from that second URI is also stale. But if the Location names a URI on a different site, it's not OK to consider a cached response from that URI to be stale. One site doesn't get to tell another site what to do.

If none of these rules apply, and if the server doesn’t specify how long to cache a response, the decision falls to the client side. Responses may be removed at any time or kept forever. More realistically, a client-side cache should consider a response to be stale after some time between an hour and a day. Remember that a stale response doesn’t have to be removed from the cache: the client might make a conditional GET request to check whether the cached response can still be used. If the condition succeeds, the cached response is still fresh and it can stay in the cache.
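The common-sense defaults can be collapsed into a small predicate. This is a sketch of my own, using the full list of response codes that section 13 of RFC 2616 makes cacheable by default (the ones named above are just the most common):

```ruby
require 'uri'

# Response codes a cache may store without explicit caching headers,
# per section 13 of RFC 2616.
CACHEABLE_BY_DEFAULT = %w[200 203 206 300 301 410]

# May a client cache this response in the absence of Expires and
# Cache-Control? Only successful GET and HEAD requests qualify, and
# a query string in the URI disqualifies the response.
def cacheable_by_default?(method, uri, status_code)
  return false unless %w[GET HEAD].include?(method)
  return false unless CACHEABLE_BY_DEFAULT.include?(status_code.to_s)
  URI(uri).query.nil?
end

puts cacheable_by_default?('GET', 'http://example.com/weblog', 200)         # true
puts cacheable_by_default?('GET', 'http://example.com/search?q=jelly', 200) # false
puts cacheable_by_default?('POST', 'http://example.com/weblog', 200)        # false
```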

Look-Before-You-Leap Requests

Conditional GET is designed to save the server from sending enormous representations to a client that already has them. Another feature of HTTP, less often used, can save the client from fruitlessly sending enormous (or sensitive) representations to the server. There's no official name for this kind of request, so I've come up with a silly name: look-before-you-leap requests.

To make a LBYL request, a client sends a PUT or POST request normally, but omits the entity-body. Instead, the client sets the Expect request header to the string “100-continue”. Example 8-5 shows a sample LBYL request.

Example 8-5. A sample look-before-you-leap request

PUT /filestore/myfile.txt HTTP/1.1
Content-length: 524288000
Expect: 100-continue

This is not a real PUT request: it’s a question about a possible future PUT request. The client is asking the server: “would you allow me to PUT a new representation to the resource at /filestore/myfile.txt?” The server makes its decision based on the current state of that resource, and the HTTP headers provided by the client. In this case the server would examine Content-length and decide whether it’s willing to accept a 500 MB file.

If the answer is yes, the server sends a status code of 100 (“Continue”). The client is then expected to resend the PUT request, this time omitting the Expect header and including the 500 MB representation in the entity-body. The server has agreed to accept that representation.

If the answer is no, the server sends a status code of 417 (“Expectation Failed”). The answer might be no because the resource at /filestore/myfile.txt is write-protected, because the client didn’t provide the proper authentication credentials, or because 500 MB is just too big. Whatever the reason, the initial look-before-you-leap request has saved the client from sending 500 MB of data only to have that data rejected. Both client and server are better off.

Of course, a client with a bad representation can lie about it in the headers just to get a status code of 100, but it won’t do any good. The server won’t accept a bad representation on the second request, any more than it would have on the first request.
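With Ruby's Net::HTTP, the LBYL request from Example 8-5 is just a PUT object with the Expect header set and no entity-body (a sketch; the hostname in the comment is hypothetical):

```ruby
require 'net/http'

# Build the look-before-you-leap request from Example 8-5: a bodiless
# PUT announcing the size of the representation we'd like to send.
request = Net::HTTP::Put.new('/filestore/myfile.txt')
request['Content-Length'] = '524288000'   # the 500 MB we haven't sent yet
request['Expect'] = '100-continue'

puts request.method            # PUT
puts request['Expect']         # 100-continue
puts request['Content-Length'] # 524288000

# A real client would now open a connection, send this request, and
# follow up with the actual PUT (entity-body included) only after
# receiving a 100 ("Continue") response rather than a 417.
```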

Partial GET

Partial HTTP GET allows a client to fetch only a subset of a representation. It’s usually used to resume interrupted downloads. Most web servers support partial GET for static content; so does Amazon’s S3 service.

Example 8-6 is a bit of code that makes two partial HTTP GET requests to the same URI. The first request gets bytes 10 through 20, and the second request gets everything from byte 40,000 to the end.

Example 8-6. Make two partial HTTP GET requests

# fetch-oreilly-partial.rb

require 'rubygems'
require 'rest-open-uri'
uri = ''

# Make a partial HTTP request and describe the response.
def partial_request(uri, range)
  begin
    response = open(uri, 'Range' => range)
  rescue OpenURI::HTTPError => e
    # Keep the error response so we can examine it.
    response = e.io
  end

  puts " Status code: #{response.status.inspect}"
  puts " Representation size: #{response.size}"
  puts " Content Range: #{response.meta['content-range']}"
  puts " Etag: #{response.meta['etag']}"
end

puts "First request:"
partial_request(uri, "bytes=10-20")

puts "Second request:"
partial_request(uri, "bytes=40000-")

When I run that code I see this for the first request:

First request:
 Status code: ["206", "Partial Content"]
 Representation size: 11
 Content Range: bytes 10-20/41123
 Etag: "7359b7-a0a3-45b5d90e"

Instead of 40 KB, the server has only sent me the 11 bytes I requested. Similarly for the second request:

Second request:
 Status code: ["206", "Partial Content"]
 Representation size: 1123
 Content Range: bytes 40000-41122/41123
 Etag: "7359b7-a0a3-45b5d90e"

Note that the Etag is the same in both cases. In fact, it’s the same as it was back when I ran the conditional GET code back in Example 8-4. The value of Etag is always a value calculated for the whole document. That way I can combine conditional GET and partial GET.
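The combination uses the If-Range request header: if the ETag still matches, the server sends just the requested bytes with a 206 response; if the representation has changed, it sends the whole new representation with a 200 response instead of a useless fragment. Here's a sketch with Net::HTTP (the path is hypothetical; the ETag is the one from the responses above):

```ruby
require 'net/http'

# Resume a download only if the representation hasn't changed since
# we fetched the first part of it.
request = Net::HTTP::Get.new('/index.html')
request['Range']    = 'bytes=40000-'
request['If-Range'] = '"7359b7-a0a3-45b5d90e"'

puts request['Range']    # bytes=40000-
puts request['If-Range'] # "7359b7-a0a3-45b5d90e"
```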

Partial GET might seem like a way to let the client access subresources of a given resource. It’s not. For one thing, a client can only address part of a representation by giving a byte range. That’s not very useful unless your representation is a binary data structure. More importantly, if you’ve got subresources that someone might want to talk about separately from the containing resource, guess what: you’ve got more resources. A resource is anything that might be the target of a hypertext link. Give those subresources their own URIs.

[26] Documented by Mark Pilgrim in “Atom Authentication” on

[27] For more detailed coverage, see section 13 of RFC 2616, and Chapter 7 of HTTP: The Definitive Guide, by Brian Totty and David Gourley (O’Reilly).
