RESTful Web Services

Cover of RESTful Web Services by Leonard Richardson... Published by O'Reilly Media, Inc.

Appendix C. The HTTP Header Top Infinity

There are already two excellent guides to the standard HTTP headers. One’s in the HTTP standard itself, and the other’s in print, in Appendix C of HTTP: The Definitive Guide by Brian Totty and David Gourley (O’Reilly). In this appendix I give a somewhat perfunctory description of the standard HTTP headers. For each header, I’ll say whether it’s found in HTTP requests, responses, or both. I’ll give my opinion as to how useful the header is when building resource-oriented web services, as opposed to other HTTP-based software like web applications and HTTP proxies. I’ll give a short description of the header, which will get a little longer for tricky or especially important headers. I won’t go into detail on what the header values should look like. I figure you’re smart and you can look up more detailed information as needed.

In Chapter 1 I compared an HTTP request or response to an envelope that contains a document (an entity-body). I compared HTTP headers to informational stickers on the envelope. It’s considered very bad form to come up with your own HTTP methods or response codes, but it’s fine to come up with your own stickers. After covering the standard HTTP headers I’ll mention a few custom headers that have become de facto parts of HTTP, like Cookie; or that are used in important technologies, like WSSE’s X-WSSE and the Atom Publishing Protocol’s Slug.

Custom headers are the most common way of extending HTTP. So long as client and server agree on what the headers mean, you can send any information you like along with a request or response. The guidelines are: don’t reinvent an existing header, don’t put things in headers that belong in the entity-body, and follow the naming convention. The names of custom headers should start with the string “X-,” meaning “extension.” The convention makes it clear that your headers are extension headers, and avoids any conflict with future official HTTP headers.

Amazon’s S3, covered in Chapter 3, is a good example of a service that defines custom headers. Not only does it define headers like X-amz-acl and X-amz-date, it specifies that S3 clients can send any header whose name begins with “X-amz-meta-.” The header name and value are associated with an object as a key-value pair, letting you store arbitrary metadata with your buckets and objects. This is a naming convention inside a naming convention.
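
Here’s a minimal sketch of that inner naming convention in Python. The function name and the sample metadata keys are made up for illustration; only the “X-amz-meta-” prefix comes from S3.

```python
def metadata_headers(metadata):
    """Turn a dict of metadata into X-amz-meta-* request headers."""
    # Header values are strings, so non-string values are converted.
    return {"X-amz-meta-" + key: str(value) for key, value in metadata.items()}

# Hypothetical metadata a client might attach to an object.
headers = metadata_headers({"author": "leonardr", "rating": 5})
```

Each key-value pair becomes one header, which the client would send along with its PUT request.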

Standard Headers

These are the 46 headers listed in the HTTP standard.

Accept

Type: Request header.

Importance: Medium.

The client sends an Accept header to tell the server what data formats it would prefer the server use in its representations. One client might want a JSON representation; another might want an RDF representation of the same data.

Hiding this information inside the HTTP headers is a good idea for web browsers, but it shouldn’t be the only solution for web service clients. I recommend exposing different representations using different URIs. This doesn’t mean you have to impose crude rules like appending .html to the URI for an HTML representation (though that’s what Rails does). But I think the information should be in the URI somehow. If you want to support Accept on top of this, that’s great (Rails does this too).
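
If you do support Accept, the server has to rank the client’s preferences. Here’s a rough sketch in Python, assuming the server keeps a plain list of the media types it can produce. The function names are hypothetical, and the sketch ignores wildcards like “*/*” to stay short.

```python
def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, best first."""
    choices = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type, q = fields[0], 1.0  # q defaults to 1 when omitted
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                q = float(value)
        if media_type:
            choices.append((media_type, q))
    # sorted() is stable, so equal q-values keep the client's order.
    return sorted(choices, key=lambda choice: choice[1], reverse=True)

def choose(header, available):
    """Pick the client's most-preferred media type the server can produce."""
    for media_type, q in parse_accept(header):
        if q > 0 and media_type in available:
            return media_type
    return None

best = choose("application/rdf+xml;q=0.5, application/json",
              ["application/json", "application/rdf+xml"])
```

Here the client will take RDF but would rather have JSON, so the server picks the JSON representation.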

Accept-Charset

Type: Request header.

Importance: Low.

The client sends an Accept-Charset header to tell the server what character set it would like the server to use in its representations. One client might want the representation of a resource containing Japanese text to be encoded in UTF-8; another might want a Shift-JIS encoding of the same data.

As I said in Chapter 8, your headaches will be fewer if you pick a Unicode encoding (either UTF-8 or UTF-16) and stick with it. Any modern client should be able to handle these encodings.

Accept-Encoding

Type: Request header.

Importance: Medium to high.

The client sends an Accept-Encoding header to tell the server that it can save some bandwidth by compressing the response entity-body with a well-known algorithm like compress or gzip. Despite the name, this has nothing to do with character set encoding; that’s Accept-Charset.

Technically, Accept-Encoding could be used to apply some other kind of transform to the entity-body: applying rot13 encryption to all of its text, maybe. In practice, it’s only used to compress data.

Accept-Language

Type: Request header.

Importance: Low.

The client sends an Accept-Language header to tell the server what human language it would like the server to use in its representations. For an example, see Chapter 4 and its discussion of a press release that’s available in both English and Spanish.

As with media types, I think that a web service should expose different-language representations of a given resource with different URIs. Supporting Accept-Language on top of this is a bonus.

Accept-Ranges

Type: Response header.

Importance: Low to medium.

The server sends this header to indicate that it supports partial HTTP GET (see Chapter 8) for the requested URI. A client can make a HEAD request to a URI, parse the value of this response header, and then send a GET request to the same URI, providing an appropriate Range header.

Age

Type: Response header.

Importance: Low.

If the response entity-body does not come fresh from the server, the Age header is a measure of how long ago it left the server. This header is usually set by HTTP caches, so that the client knows it might be getting an old copy of a representation.

Allow

Type: Response header.

Importance: Potentially high, currently low.

I discuss this header in “HEAD and OPTIONS” in Chapter 4. It’s sent in response to an OPTIONS request and tells the client which subset of the uniform interface a particular URI exposes. This header will become much more important if people ever start using OPTIONS.

Authorization

Type: Request header.

Importance: Very high.

This request header contains authorization credentials, such as a username and password, which the client has encoded according to some agreed-upon scheme. The server decodes the credentials and decides whether or not to carry out the request.

In theory, this is the only authorization header anyone should ever need (except for Proxy-Authorization, which works on a different level), because it’s extensible. The most common schemes are HTTP Basic and HTTP Digest, but the scheme can be anything, so long as both client and server understand it. In practice, HTTP itself has been extended, with unofficial request headers like X-WSSE that work on top of Authorization. See the X-WSSE entry below for the reason why.
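
As a small illustration of “encoded according to some agreed-upon scheme,” here’s how a client might build the header value for HTTP Basic. This is a sketch; the function name is mine, and the sample credentials are the ones used as an example in the HTTP authentication standard.

```python
import base64

def basic_authorization(username, password):
    """Build the value of an Authorization header for HTTP Basic."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")

header = basic_authorization("Aladdin", "open sesame")
```

Note that base64 is an encoding, not encryption. Anyone who sees the header can decode the username and password.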

Cache-Control

Type: Request and response header.

Importance: Medium.

This header contains a directive to any caches between the client and the server (including any caches on the client or server themselves). It spells out the rules for how the data should be cached and when it should be dumped. I cover some simple caching rules and recipes in “Caching” in Chapter 8.

Connection

Type: Response header.

Importance: Low.

Most of an HTTP response is a communication from the server to the client. Intermediaries like proxies can look at the response, but nothing in there is aimed at them. But a server can insert extra headers that are aimed at a proxy, and one proxy can insert headers that are aimed at the next proxy in a chain. When this happens, the special headers are named in the Connection header. These headers apply to the TCP connection between one machine and another, not to the HTTP connection between server and client. Before passing on the response, the proxy is supposed to remove the special headers and the Connection header itself. Of course, it may add its own special communications, and a new Connection header, if it wants.

Here’s a quick example, since this isn’t terribly relevant to this book. The server might send these three HTTP headers in a response that goes through a proxy:

Content-Type: text/plain
X-Proxy-Directive: Deliver this as fast as you can!
Connection: X-Proxy-Directive

The proxy would remove X-Proxy-Directive and Connection, and send the one remaining header to the client:

Content-Type: text/plain
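
The proxy’s side of this exchange can be sketched in a few lines of Python. I’m assuming the headers arrive as a dictionary with the usual capitalization; the function name is made up.

```python
def strip_hop_by_hop(headers):
    """Remove Connection and every header it names, as a proxy should."""
    named = {
        name.strip().lower()
        for name in headers.get("Connection", "").split(",")
        if name.strip()
    }
    named.add("connection")  # the Connection header itself goes too
    return {k: v for k, v in headers.items() if k.lower() not in named}

forwarded = strip_hop_by_hop({
    "Content-Type": "text/plain",
    "X-Proxy-Directive": "Deliver this as fast as you can!",
    "Connection": "X-Proxy-Directive",
})
```

After the call, forwarded contains only the Content-Type header.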

If you’re writing a client and not using proxies, the only value you’re likely to see for Connection is “close.” That just says that the server will close the TCP connection after completing this request, which is probably what you expected anyway.

Content-Encoding

Type: Response header.

Importance: Medium to high.

This response header is the counterpart to the request header Accept-Encoding. The request header asks the server to compress the entity-body using a certain algorithm. This header tells the client which algorithm, if any, the server actually used.
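
Here’s a minimal sketch of the server’s half of that exchange, using Python’s standard gzip module. The function name is hypothetical, and the sketch ignores quality values (so it would wrongly compress for a client that sent “gzip;q=0”).

```python
import gzip

def encode_body(body, accept_encoding):
    """Return (headers, body), compressing only if the client allows gzip."""
    offered = [token.split(";")[0].strip().lower()
               for token in accept_encoding.split(",")]
    if "gzip" in offered:
        return {"Content-Encoding": "gzip"}, gzip.compress(body)
    return {}, body  # no Content-Encoding header: body sent as-is

headers, compressed = encode_body(b"hello " * 100, "compress, gzip")
```

The client checks Content-Encoding before touching the entity-body, and decompresses only if the header says it should.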

Content-Language

Type: Response header.

Importance: Medium.

This response header is the counterpart to the Accept-Language request header, or to a corresponding variable set in a resource’s URI. It specifies the natural language a human must understand to get meaning out of the entity-body.

There may be multiple languages listed here. If the entity-body is a movie in Mandarin with Japanese subtitles, the value for Content-Language might be “zh-guoyu,ja.” If one English phrase shows up in the movie, “en” would probably not show up in the Content-Language header.

Content-Length

Type: Response header.

Importance: High.

This response header gives the size of the entity-body in bytes. This is important for two reasons: first, a client can read this and prepare for a small entity-body or a large one. Second, a client can make a HEAD request to find out how large the entity-body is, without actually requesting it. The value of Content-Length might affect the client’s decision to fetch the entire entity-body, fetch part of it with Range, or not fetch it at all.

Content-Location

Type: Response header.

Importance: Low.

This header tells the client the canonical URI of the resource it requested. Unlike with the value of the Location header, this is purely informative. The client is not expected to start using the new URI.

This is mainly useful for services that assign different URIs to different representations of the same resource. If the client wants to link to the specific representation obtained through content negotiation, it can use the URI given in Content-Location. So if you request /releases/104, and use the Accept and Accept-Language headers to specify an HTML representation written in English, you might get back a response that specifies /releases/104.html.en as the value for Content-Location.

Content-MD5

Type: Response header.

Importance: Low to medium.

This is a cryptographic checksum of the entity-body. The client can use this to check whether or not the entity-body was corrupted in transit. An attacker (such as a man-in-the-middle) can change the entity-body and change the Content-MD5 header to match, so it’s no good for security, just error detection.
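
The header value is the base64 encoding of the MD5 digest of the entity-body. Here’s a short Python sketch of computing and checking it; the function names are mine.

```python
import base64
import hashlib

def content_md5(body):
    """Base64-encode the MD5 digest of the entity-body bytes."""
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")

def is_intact(body, header_value):
    """True if the body matches the Content-MD5 value that came with it."""
    return content_md5(body) == header_value
```

A client that gets False back from is_intact should throw the entity-body away and make the request again.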

Content-Range

Type: Response header.

Importance: Low to medium.

When the client makes a partial GET request with the Range request header, this response header says what part of the representation the client is getting.
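
A value looks like “bytes 0-499/1234”: the first and last byte positions served, then the total size of the representation (or “*” if the server doesn’t know it). Here’s a sketch of a client-side parser in Python. The function name is hypothetical, and it only handles ranges measured in bytes.

```python
import re

def parse_content_range(value):
    """Return (first, last, total) from a value like 'bytes 0-499/1234'.

    total is None when the server sends '*' for an unknown length.
    """
    match = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+|\*)", value.strip())
    if match is None:
        raise ValueError("unrecognized Content-Range: " + value)
    first, last, total = match.groups()
    return int(first), int(last), None if total == "*" else int(total)
```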

Content-Type

Type: Response header.

Importance: Very high.

Definitely the most famous response header. This header tells the client what kind of thing the entity-body is. On the human web, a web browser uses this to decide if it can display the entity-body inline, and which external program it must run if not. On the programmable web, a web service client usually uses this to decide which parser to apply to the entity-body.

Date

Type: Request and response header.

Importance: High for request, required for response.

As a request header, this represents the time on the client at the time the request was sent. As a response header, it represents the time on the server at the time the request was fulfilled. As a response header, Date is used by caches.

ETag

Type: Response header.

Importance: Very high.

The value of ETag is an opaque string designating a specific version of a representation. Whenever the representation changes, the ETag should also change.

Whenever possible, this header ought to be sent in response to GET requests. Clients can use the value of ETag in future conditional GET requests, as the value of If-None-Match. If the representation hasn’t changed, the ETag hasn’t changed either, and the server can save time and bandwidth by not sending the representation again.

The main driver of conditional GET requests is the simpler Last-Modified response header, and its request counterpart If-Modified-Since. The main purpose of ETag is to provide a second line of defense. If a representation changes twice in one second, it will take on only one value for Last-Modified, but two different values for ETag.
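
HTTP doesn’t say how a server should come up with ETags. One simple approach, which I’m assuming here purely for illustration, is to hash the bytes of the representation. The function name is made up.

```python
import hashlib

def make_etag(representation):
    """Derive a quoted ETag from the bytes of a representation."""
    return '"' + hashlib.sha1(representation).hexdigest() + '"'

# Two versions saved within the same second still get distinct ETags.
first = make_etag(b"<title>Draft 1</title>")
second = make_etag(b"<title>Draft 2</title>")
```

The drawback is that the server has to generate the whole representation before it can compute the hash. A version number stored with the resource would be cheaper.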

Expect

Type: Request header.

Importance: Medium, but rarely used (as of time of writing).

This header is used to signal a look-before-you-leap (LBYL) request (covered in Chapter 8). The server will send the response code 100 (“Continue”) if the client should “leap” ahead and make the real request. It will send the response code 417 (“Expectation Failed”) if the client should not “leap.”

Expires

Type: Response header.

Importance: Medium.

This header tells the client, or a proxy between the server and client, that it may cache the response (not just the entity-body!) until a certain time. Even a conditional HTTP GET makes an HTTP connection and takes time and resources. By paying attention to Expires, a client can avoid the need to make any HTTP requests at all—at least for a while. I cover caching briefly in Chapter 8.

The client should take the value of Expires as a rough guide, not as a promise that the entity-body won’t change until that time.

From

Type: Request header.

Importance: Very low.

This header works just like the From header in an email message. It gives an email address associated with the person making the request. This is never used on the human web because of privacy concerns, and it’s used even less on the programmable web, where the clients aren’t under the control of human beings. You might want to use it as an extension to User-Agent.

Host

Type: Request header.

Importance: Required.

This header contains the domain name part of the URI. If a client makes a GET request for http://www.example.com/page.html, then the URI path is /page.html and the value of the Host header is “www.example.com” or “www.example.com:80.”

From the client’s point of view, this may seem like a strange header to require. It’s required because an HTTP 1.1 server can host any number of domains on a single IP address. This feature is called “name-based virtual hosting,” and it saves someone who owns multiple domain names from having to buy a separate computer and/or network card for each one. The problem is that an HTTP client sends requests to an IP address, not to a domain name. Without the Host header, the server has no idea which of its virtual hosts is the target of the client’s request.
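
To make the split concrete, here’s a Python sketch that takes a URI apart and builds the first lines of an HTTP 1.1 request. The function name is mine, and the sketch ignores URIs that embed a username and password.

```python
from urllib.parse import urlsplit

def build_get_request(uri):
    """Build a bare-bones HTTP 1.1 GET request for the given URI."""
    parts = urlsplit(uri)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    # The domain name goes in Host; only the path goes on the request line.
    return f"GET {path} HTTP/1.1\r\nHost: {parts.netloc}\r\n\r\n"

request = build_get_request("http://www.example.com/page.html")
```

The client would open a TCP connection to whatever IP address www.example.com resolves to, and send this text.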

If-Match

Type: Request header.

Importance: Medium.

This header is best described in terms of other headers. It’s used like If-Unmodified-Since (described later), to make HTTP actions other than GET conditional. But where If-Unmodified-Since takes a time as its value, this header takes an ETag as its value.

Tersely, this header is to If-None-Match and ETag as If-Unmodified-Since is to If-Modified-Since and Last-Modified.

If-Modified-Since

Type: Request header.

Importance: Very high.

This request header is the backbone of conditional HTTP GET. Its value is a previous value of the Last-Modified response header, obtained from a previous request to this URI. If the resource has changed since that last request, its new Last-Modified date is more recent than the one the client sent. That means that the condition If-Modified-Since is met, and the server sends the new entity-body. If the resource has not changed, the Last-Modified date is the same as it was, and the condition If-Modified-Since fails. The server sends a response code of 304 (“Not Modified”) and no entity-body. That is, conditional HTTP GET succeeds if this condition fails.

Since Last-Modified is only accurate to within one second, conditional HTTP GET can occasionally give the wrong result if it relies only on If-Modified-Since. This is the main reason why we also use ETag and If-None-Match.

If-None-Match

Type: Request header.

Importance: Very high.

This header is also used in conditional HTTP GET. Its value is a previous value of the ETag response header, obtained from a previous request to this URI. If the ETag has changed since that last request, the condition If-None-Match succeeds and the server sends the new entity-body. If the ETag is the same as before, the condition fails, and the server sends a response code of 304 (“Not Modified”) with no entity-body.
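
Here’s a sketch of how a server might evaluate the two conditional GET headers, checking If-None-Match first and falling back to If-Modified-Since. The function name and argument layout are my own assumptions, and the sketch ignores weak ETags.

```python
from email.utils import parsedate_to_datetime

def conditional_get_status(request_headers, etag, last_modified):
    """Return 304 if the client's cached copy is still good, else 200.

    etag is the current ETag string; last_modified is the current
    Last-Modified value as an HTTP date string.
    """
    if "If-None-Match" in request_headers:
        sent = [tag.strip() for tag in request_headers["If-None-Match"].split(",")]
        return 304 if etag in sent or "*" in sent else 200
    if "If-Modified-Since" in request_headers:
        since = parsedate_to_datetime(request_headers["If-Modified-Since"])
        changed = parsedate_to_datetime(last_modified) > since
        return 200 if changed else 304
    return 200  # unconditional GET: always send the representation
```

A 304 response carries no entity-body, so the server can skip generating the representation altogether.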

If-Range

Type: Request header.

Importance: Low.

This header is used to make a conditional partial GET request. Its value is the ETag or Last-Modified response header from a previous request to this URI, and it’s sent alongside a Range header. If the representation has not changed, the server sends only the requested range, with a response code of 206 (“Partial Content”). If the representation has changed, the server ignores Range and sends the entire new representation with a response code of 200 (“OK”).

Conditional partial GET is not used very often. It’s mainly useful for resuming an interrupted download in one request: the client gets the missing bytes if its partial copy is still good, and a fresh copy if not.

If-Unmodified-Since

Type: Request header.

Importance: Medium.

Normally a client uses the value of the response header Last-Modified as the value of the request header If-Modified-Since to perform a conditional GET request. This header also takes the value of Last-Modified, but it’s usually used for making HTTP actions other than GET into conditional actions.

Let’s say you and many other people are interested in modifying a particular resource. You fetch a representation, modify it, and send it back with a PUT request. But someone else has modified it in the meantime, and you either get a response code of 409 (“Conflict”), or you put the resource into a state you didn’t intend.

If you make your PUT request conditional on If-Unmodified-Since, then if someone else has changed the resource your request will always get a response code of 412 (“Precondition Failed”). You can refetch the representation and decide what to do with the new version that someone else modified.
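
The server’s check can be sketched like this in Python. The function name is hypothetical, and the 200 stands in for whatever success code the service uses for a PUT.

```python
from email.utils import parsedate_to_datetime

def conditional_put_status(if_unmodified_since, last_modified):
    """Return 412 if the resource changed after the client's timestamp."""
    client_time = parsedate_to_datetime(if_unmodified_since)
    server_time = parsedate_to_datetime(last_modified)
    if server_time > client_time:
        return 412  # Precondition Failed: someone else got there first
    return 200      # nothing changed, so it's safe to apply the PUT
```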

This header can be used with GET, too; see the Range header for an example.

Last-Modified

Type: Response header.

Importance: Very high.

This header makes conditional HTTP GET possible. It tells the client the last time the representation changed. The client can keep track of this date and use it in the If-Modified-Since header of a future request.

In web applications, Last-Modified is usually the current time, which makes conditional HTTP GET useless. Web services should try to do a little better, since web service clients often besiege their servers with requests for the same URIs over and over again. See “Conditional GET” in Chapter 8 for ideas.

Location

Type: Response header.

Importance: Very high.

This is a versatile header with many related functions. It’s heavily associated with the 3xx (“Redirection”) response codes, and much of the confusion surrounding HTTP redirects has to do with how this header should be interpreted.

This header usually tells the client which URI it should be using to access a resource; presumably the client doesn’t already know. This might be because the client’s request created the resource—response code 201 (“Created”)—or caused the resource to change URIs—301 (“Moved Permanently”). It may also be because the client used a URI that’s not quite right, though not so wrong that the server didn’t recognize it. In that case the response code might be 301 again, or 307 (“Temporary Redirect”) or 302 (“Found”).

Sometimes the value of Location is just a default URI: one of many possible resolutions to an ambiguous request, e.g., 300 (“Multiple Choices”). Sometimes the value of Location points not to the resource the client tried to access, but to some other resource that provides supplemental information, e.g., 303 (“See Other”).

As you can see, this header can only be understood in the context of a particular HTTP response code. Refer to the appropriate section of Appendix B for more details.

Max-Forwards

Type: Request header.

Importance: Very low.

This header is mainly used with the TRACE method, which is used to track the proxies that handle a client’s HTTP request. I don’t cover TRACE in this book, but as part of a TRACE request, Max-Forwards is used to limit how many proxies the request can be sent through.

Pragma

Type: Request and response header.

Importance: Very low.

The Pragma header is a spot for special directives between the client, server, and intermediaries such as proxies. The only official pragma is “no-cache,” which is obsolete in HTTP 1.1: it’s the same as sending a value of “no-cache” for the Cache-Control header. You may define your own HTTP pragmas, but it’s better to define your own HTTP headers instead. See, for instance, the X-Proxy-Directive header I made up while explaining the Connection header.

Proxy-Authenticate

Type: Response header.

Importance: Low to medium.

Some clients (especially in corporate environments) can only get HTTP access through a proxy server. Some proxy servers require authentication. This header is a proxy’s way of demanding authentication. It’s sent along with a response code of 407 (“Proxy Authentication Required”), and it works just like WWW-Authenticate, except it tells the client how to authenticate with the proxy, not with the web server on the other end. While the response to a WWW-Authenticate challenge goes into Authorization, the response to a Proxy-Authenticate challenge goes into Proxy-Authorization (see below). A single request may need to include both Authorization and Proxy-Authorization headers: one to authenticate with the web service, the other to authenticate with the proxy.

Since most web services don’t include proxies in their architecture, this header is not terribly relevant to the kinds of services covered in this book. But it may be relevant to a client, if there’s a proxy between the client and the rest of the web.

Proxy-Authorization

Type: Request header.

Importance: Low to medium.

This header is an attempt to get a request through a proxy that demands authentication. It works similarly to Authorization. Its format depends on the scheme defined in Proxy-Authenticate, just as the format of Authorization depends on the scheme defined in WWW-Authenticate.

Range

Type: Request header.

Importance: Medium.

This header signifies the client’s attempt to request only part of a resource’s representation (see “Partial GET” in Chapter 8). A client typically sends this header because it tried earlier to download a large representation and got cut off. Now it’s back for the rest of the representation. Because of this, this header is usually coupled with If-Unmodified-Since. If the representation has changed since your last request, you probably need to GET it from the beginning.
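
Here’s a sketch of the headers a client might send to resume an interrupted download, pairing Range with the If-Unmodified-Since header described earlier. The function name is made up, and I’m assuming the client saved the Last-Modified value from its first attempt.

```python
def resume_headers(bytes_received, last_modified):
    """Build request headers asking for the rest of a representation."""
    return {
        # Byte positions start at zero, so this asks for everything
        # after the bytes the client already has.
        "Range": f"bytes={bytes_received}-",
        "If-Unmodified-Since": last_modified,
    }

headers = resume_headers(5000, "Sat, 29 Oct 1994 19:43:31 GMT")
```

If the representation has changed, the server responds with 412 (“Precondition Failed”) and the client knows to start over.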

Referer

Type: Request header.

Importance: Low.

When you click a link in your web browser, the browser sends an HTTP request in which the value of the Referer header is the URI of the page you were just on. That’s the URI that “refered” your client to the URI you’re now requesting. Yes, it’s misspelled.

Though common on the human web, this header is rarely found on the programmable web. It can be used to convey a bit of application state (the client’s recent path through the service) to the server.

Retry-After

Type: Response header.

Importance: Low to medium.

This header usually comes with a response code that denotes failure: either 413 (“Request Entity Too Large”), or one of the 5xx series (“Server-side error”). It tells the client that while the server couldn’t fulfill the request right now, it might be able to fulfill the same request at a later time. The value of the header is the time when the client should try again, or the number of seconds it should wait.

If a server chooses every client’s Retry-After value using the same rules, that just guarantees the same clients will make the same requests in the same order a little later, possibly causing the problem all over again. The server should use some randomization technique to vary Retry-After, similar to Ethernet’s backoff period.
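
A minimal sketch of that randomization in Python: the server picks a base delay and adds a random number of seconds to it. The function name and the particular numbers are assumptions, not recommendations.

```python
import random

def retry_after(base_seconds, spread_seconds, rng=random):
    """Pick a Retry-After value between base and base + spread seconds."""
    return str(base_seconds + rng.randint(0, spread_seconds))

# Each client is told to wait somewhere between 30 and 90 seconds.
header_value = retry_after(30, 60)
```

Because each response gets a different value, the clients come back spread out over a minute instead of all at once.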

TE

Type: Request header.

Importance: Low.

This is another “Accept”-type header, one that lets the client specify which transfer encodings it will accept (see Transfer-Encoding below for an explanation of transfer encodings). HTTP: The Definitive Guide by Brian Totty and David Gourley (O’Reilly) points out that a better name would have been “Accept-Transfer-Encoding.”

In practice, the value of TE only conveys whether or not the client understands chunked encoding and HTTP trailers, two topics I don’t really cover in this book.

Trailer

Type: Response header.

Importance: Low.

When a server sends an entity-body using chunked transfer encoding, it may choose to put certain HTTP headers at the end of the entity-body rather than before it (see Transfer-Encoding below for details). This turns them from headers into trailers. The server signals that it’s going to send a header as a trailer by putting its name as the value of the header called Trailer. Here’s one possible value for Trailer:

Trailer: Content-MD5

The server will be providing a value for Content-MD5 once it’s served the entity-body and can compute the checksum. A few headers can’t be sent as trailers: Content-Length, Transfer-Encoding, and Trailer itself.

Transfer-Encoding

Type: Response header.

Importance: Low.

Sometimes a server needs to send an entity-body without knowing important facts like how large it is. Rather than omitting HTTP headers like Content-MD5, the server may decide to send the entity-body in chunks, and put such headers at the end of the entity-body rather than before. The idea is that by the time all the chunks have been sent, the server knows the things it didn’t know before, and it can send Content-MD5 and the like as “trailers” instead of “headers.” Content-Length is the exception: each chunk announces its own size, so a chunked response doesn’t send Content-Length at all.

It’s an HTTP 1.1 requirement that clients support chunked transfer-encoding, but I don’t know of any programmable clients (as opposed to web browsers) that do.
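
The wire format is simple enough to sketch in Python. Each chunk is preceded by its size in hexadecimal, a zero-length chunk marks the end, and any trailers follow. The function name is mine, and the sketch builds the whole body in memory, which a real server streaming chunks would not do.

```python
def encode_chunked(chunks, trailers=None):
    """Encode an iterable of byte strings using chunked transfer encoding."""
    out = b""
    for chunk in chunks:
        if chunk:  # a zero-length chunk would end the body early
            out += f"{len(chunk):x}\r\n".encode("ascii") + chunk + b"\r\n"
    out += b"0\r\n"  # last-chunk marker
    for name, value in (trailers or {}).items():
        out += f"{name}: {value}\r\n".encode("ascii")
    return out + b"\r\n"  # blank line ends the trailers and the message

body = encode_chunked([b"Hello, ", b"world"])
```

The server never needed to know the total size up front: it just announced 7 bytes, then 5 bytes, then the end.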

Upgrade

Type: Request header.

Importance: Very low.

If you’d rather be using some protocol other than HTTP, you can tell the server that by sending an Upgrade header. If the server happens to speak the protocol you’d rather be using, it will send back a response code of 101 (“Switching Protocols”) and immediately begin speaking the new protocol.

The value of the header is a list of the protocols the client would rather use. There is no standard format for the protocol names, but the sample Upgrade header from RFC 2616 shows what the designers of HTTP had in mind:

Upgrade: HTTP/2.0, SHTTP/1.3, IRC/6.9, RTA/x11

User-Agent

Type: Request header.

Importance: High.

This header lets the server know what kind of software is making the HTTP request. On the human web this is a string that identifies the brand of web browser. On the programmable web it usually identifies the HTTP library or client library that was used to write the client. It may identify a specific client program instead.
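
Here is a small sketch of a client identifying itself, using Python's standard urllib.request module. The URL and the client name my-bookmark-client/1.0 are placeholders. The code only builds the request object; it doesn't send anything.

```python
import urllib.request

# Build (but don't send) a request that identifies a specific client program
# instead of falling back on the HTTP library's default User-Agent.
request = urllib.request.Request(
    "http://www.example.com/",
    headers={"User-Agent": "my-bookmark-client/1.0"},
)

# urllib stores header names capitalized as "User-agent".
user_agent = request.get_header("User-agent")
```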

Soon after the human web became popular, servers started sniffing User-Agent to determine what kind of browser was on the other end. They then sent different representations based on the value of User-Agent. Elsewhere in this book I’ve voiced my opinion that it’s not a great idea to have request headers like Accept-Language be the only way a client can distinguish between different representations of the same resource. Sending different representations based on the value of User-Agent is an even worse idea. Not only has User-Agent sniffing perpetuated incompatibilities between web browsers, it’s led to an arms race inside the User-Agent header itself.

Almost every browser these days pretends to be Mozilla, because that was the internal code-name of the first web browser to become popular (Netscape Navigator). A browser that doesn’t pretend to be Mozilla may not get the representation it needs. Some pretend to be both Mozilla and MSIE, so they can trigger code for the current most popular web browser (Internet Explorer). A few browsers even allow the user to select the User-Agent for every request, to trick servers into sending the right representations.

Don’t let this happen to the programmable web. A web service should only use User-Agent to gather statistics and to deny access to poorly programmed clients. It should not use User-Agent to tailor its representations to specific clients.

Vary

Type: Response header.

Importance: Low to medium.

The Vary header tells the client which request headers it can vary to get different representations of a resource. Here’s a sample value:

Vary: Accept, Accept-Language

That value tells the client that it can ask for the representation in a different file format, by setting or changing the Accept header. It can ask for the representation in a different language, by setting or changing Accept-Language.

That value also tells a cache to cache (say) the Japanese representation of the resource separately from the English representation. The Japanese representation isn’t a brand new byte stream that invalidates the cached English version. The two requests sent different values for a header that varies (Accept-Language), so the responses should be cached separately. If the value of Vary is “*”, that means that the response should not be cached.
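
One way to picture the cache's side of this is as a cache key built from the URI plus the request headers named in Vary. The following Python sketch assumes that Vary is a comma-separated list of header names, as RFC 2616 defines it. The function name cache_key and the example URI are made up, and a real cache has much more to worry about.

```python
def cache_key(uri, request_headers, vary):
    """Build a cache key from the URI plus the request headers named in Vary.

    Returns None when Vary is "*", meaning the response should not be cached.
    """
    names = [name.strip().lower() for name in vary.split(",") if name.strip()]
    if "*" in names:
        return None
    # Header names are case-insensitive, so normalize them before the lookup.
    sent = {name.lower(): value for name, value in request_headers.items()}
    return (uri,) + tuple((name, sent.get(name, "")) for name in sorted(names))


uri = "http://www.example.com/doc"
vary = "Accept, Accept-Language"
english = cache_key(uri, {"Accept-Language": "en"}, vary)
japanese = cache_key(uri, {"Accept-Language": "ja"}, vary)
```

The two keys differ only in the Accept-Language entry, so the English and Japanese representations land in separate cache slots.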

Via

Type: Request and response header.

Importance: Low.

When an HTTP request goes directly from the client to the server, or a response goes directly from server to client, there is no Via header. When there are intermediaries (like proxies) in the way, each one slaps a Via header on the request or response message. The recipient of the message can look at the Via headers to see the path the HTTP message took through the intermediaries.
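
The following Python sketch shows roughly how a chain of intermediaries builds up the Via header and how the recipient might read it back. It assumes the simplest form of a Via entry from RFC 2616: a protocol version followed by a host name or pseudonym, with entries separated by commas. The function names and the proxy names are hypothetical, and the optional parenthesized comments are ignored.

```python
def add_via(headers, protocol_version, pseudonym):
    """Return a copy of headers with one more intermediary appended to Via."""
    entry = f"{protocol_version} {pseudonym}"
    updated = dict(headers)
    if "Via" in updated:
        updated["Via"] = updated["Via"] + ", " + entry
    else:
        updated["Via"] = entry
    return updated


def via_path(headers):
    """List the intermediaries a message passed through, in order."""
    return [hop.strip() for hop in headers.get("Via", "").split(",") if hop.strip()]


# Two hypothetical proxies handle the message one after the other.
headers = add_via({}, "1.0", "fred")
headers = add_via(headers, "1.1", "nowhere.com")
path = via_path(headers)
```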

Warning

Type: Response header (can technically be used with requests).

Importance: Low.

The Warning header is a supplement to the HTTP response code. It’s usually inserted by an intermediary like a caching proxy, to tell the user about possible problems that aren’t obvious from looking at the response.

Like response codes, each HTTP warning has a three-digit numeric value: a “warn-code.” Most warnings have to do with cache behavior. This Warning says that the caching proxy at localhost:9090 sent a cached response even though it knew the response to be stale:

Warning: 110 localhost:9090 "Response is stale"

The warn-code 110 means “Response is stale” as surely as the HTTP response code 404 means “Not Found.” The HTTP standard defines seven warn-codes, which I won’t go into here.
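
A client that wants to act on a warning has to pull the value apart first. Here is a minimal Python sketch that splits a Warning value into its warn-code, the agent that added it, and the human-readable text. The function name parse_warning is made up. It handles a single warning only, and it assumes there is no optional date on the end.

```python
def parse_warning(value):
    """Split one Warning value into (warn_code, warn_agent, warn_text)."""
    code, agent, text = value.split(" ", 2)
    # The text is normally a quoted string; drop the surrounding quotes.
    return int(code), agent, text.strip('"')


warning = parse_warning('110 localhost:9090 "Response is stale"')
```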

WWW-Authenticate

Type: Response header.

Importance: Very high.

This header accompanies a response code of 401 (“Unauthorized”). It’s the server’s demand that the client send some authentication next time it requests the URI. It also tells the client what kind of authentication the server expects. This may be HTTP Basic auth, HTTP Digest auth, or something more exotic like WSSE.
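
To show what the client does with this demand, here is a Python sketch that reads a simple WWW-Authenticate value and builds the Authorization header for HTTP Basic auth. The function names and the realm are hypothetical. The parser is deliberately naive: it assumes one challenge per header and no commas inside quoted values. The credentials are the well-known sample pair from RFC 2617.

```python
import base64


def parse_challenge(value):
    """Split a simple WWW-Authenticate value into (scheme, params)."""
    scheme, _, rest = value.partition(" ")
    params = {}
    for part in rest.split(","):
        name, sep, val = part.strip().partition("=")
        if sep:
            params[name] = val.strip('"')
    return scheme, params


def basic_authorization(username, password):
    """Build the Authorization header value for HTTP Basic auth."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


scheme, params = parse_challenge('Basic realm="My Private Data"')
if scheme == "Basic":
    authorization = basic_authorization("Aladdin", "open sesame")
```

The client would send the result as its Authorization header the next time it requests the URI. Digest and WSSE need more work than this, because the client has to compute a response from values in the challenge.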
