Cover by Sam Ruby, Leonard Richardson


HTTP: Documents in Envelopes

If I were classifying marine animals I’d start by talking about the things they have in common: DNA, cellular structure, the laws of embryonic development. Then I’d show how animals distinguish themselves from each other by specializing away from the common ground. To classify the programmable web, I’d like to start off with an overview of HTTP, the protocol that all web services have in common.

HTTP is a document-based protocol, in which the client puts a document in an envelope and sends it to the server. The server returns the favor by putting a response document in an envelope and sending it to the client. HTTP has strict standards for what the envelopes should look like, but it doesn’t much care what goes inside. Example 1-5 shows a sample envelope: the HTTP request my web browser sends when I visit the homepage of oreilly.com. I’ve truncated two lines to make the text fit on the printed page.

Example 1-5. An HTTP GET request for http://www.oreilly.com/index.html

GET /index.html HTTP/1.1
Host: www.oreilly.com
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12)...
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,...
Accept-Language: us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-15,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive

In case you’re not familiar with HTTP, now is a good time to point out the major parts of the HTTP request. I use these terms throughout the book.

The HTTP method

In this request, the method is “GET.” In other discussions of REST you may see this called the “HTTP verb” or “HTTP action.”

The name of the HTTP method is like a method name in a programming language: it indicates how the client expects the server to process this envelope. In this case, the client (my web browser) is trying to GET some information from the server (www.oreilly.com).

The path

This is the portion of the URI to the right of the hostname: here, http://www.oreilly.com/index.html becomes “/index.html.” In terms of the envelope metaphor, the path is the address on the envelope. In this book I sometimes say “URI” as shorthand for just the path.

The request headers

These are bits of metadata: key-value pairs that act like informational stickers slapped onto the envelope. This request has eight headers: Host, User-Agent, Accept, and so on. There’s a standard list of HTTP headers (see Appendix C), and applications can define their own.

The entity-body, also called the document or representation

This is the document inside the envelope. This particular request has no entity-body, which means the envelope is empty! This is typical for a GET request, where all the information needed to complete the request is in the path and the headers.
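To make the four parts concrete, here is a minimal Python sketch that takes the text of Example 1-5 and pulls it apart. The function name parse_request is my own invention for illustration, not part of any library, and the parsing is deliberately naive: it assumes lines end in CRLF (as HTTP specifies), that no header is repeated, and that header names are compared exactly as written. A real HTTP library handles many cases this one ignores.

```python
# Text of Example 1-5. The "..." on two lines is the truncation from the
# printed example, kept as-is rather than filled in.
RAW_REQUEST = (
    "GET /index.html HTTP/1.1\r\n"
    "Host: www.oreilly.com\r\n"
    "User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12)...\r\n"
    "Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,...\r\n"
    "Accept-Language: us,en;q=0.5\r\n"
    "Accept-Encoding: gzip,deflate\r\n"
    "Accept-Charset: ISO-8859-15,utf-8;q=0.7,*;q=0.7\r\n"
    "Keep-Alive: 300\r\n"
    "Connection: keep-alive\r\n"
    "\r\n"
)


def parse_request(raw):
    """Split a raw HTTP request into (method, path, headers, entity_body).

    Hypothetical helper for illustration only; not a complete HTTP parser.
    """
    # A blank line separates the envelope (request line plus headers)
    # from the document inside it.
    head, _, entity_body = raw.partition("\r\n\r\n")
    request_line, *header_lines = head.split("\r\n")

    # The request line carries the method, the path, and the protocol version.
    method, path, _version = request_line.split(" ", 2)

    # Each header is a "Name: value" pair: one sticker on the envelope.
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()

    return method, path, headers, entity_body


method, path, headers, entity_body = parse_request(RAW_REQUEST)
print(method)             # GET
print(path)               # /index.html
print(len(headers))       # 8
print(repr(entity_body))  # ''
```

The empty string at the end is the empty envelope: this GET request carries no entity-body.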

The HTTP response is also a document in an envelope. It’s almost identical in form to the HTTP request. Example 1-6 shows a trimmed version of what the server at oreilly.com sends my web browser when I make the request in Example 1-5.

Example 1-6. The response to an HTTP GET request for http://www.oreilly.com/index.html

HTTP/1.1 200 OK
Date: Fri, 17 Nov 2006 15:36:32 GMT
Server: Apache
Last-Modified: Fri, 17 Nov 2006 09:05:32 GMT
Etag: "7359b7-a7fa-455d8264"
Accept-Ranges: bytes
Content-Length: 43302
Content-Type: text/html
X-Cache: MISS from www.oreilly.com
Keep-Alive: timeout=15, max=1000
Connection: Keep-Alive

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
...
<title>oreilly.com -- Welcome to O'Reilly Media, Inc.</title>
...

The response can be divided into three parts:

The HTTP response code

This numeric code tells the client whether its request went well or poorly, and how the client should regard this envelope and its contents. In this case the GET operation must have succeeded, since the response code is 200 (“OK”). I describe the HTTP response codes in Appendix B.

The response headers

Just as with the request headers, these are informational stickers slapped onto the envelope. This response has 10 headers: Date, Server, and so on.

The entity-body or representation

Again, this is the document inside the envelope, and this time there actually is one! The entity-body is the fulfillment of my GET request. The rest of the response is just an envelope with stickers on it, telling the web browser how to deal with the document.
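The same three-way split can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a real HTTP client: parse_response is a name I made up, the response text is an abbreviated copy of Example 1-6 (three of its headers and one line of its entity-body), and the parser assumes CRLF line endings and no repeated headers. Header names are lowercased here because HTTP header names are case-insensitive.

```python
# An abbreviated copy of Example 1-6: the status line, three of its headers,
# and a single line standing in for the full HTML document.
RAW_RESPONSE = (
    "HTTP/1.1 200 OK\r\n"
    "Date: Fri, 17 Nov 2006 15:36:32 GMT\r\n"
    "Server: Apache\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<title>oreilly.com -- Welcome to O'Reilly Media, Inc.</title>\n"
)


def parse_response(raw):
    """Split a raw HTTP response into (code, reason, headers, entity_body).

    Hypothetical helper for illustration only; not a complete HTTP parser.
    """
    # The blank line marks where the envelope ends and the document begins.
    head, _, entity_body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")

    # The status line holds the protocol version, the numeric response
    # code, and a human-readable reason phrase.
    _version, code, reason = status_line.split(" ", 2)

    # Split each header on its first colon only, so values that contain
    # colons (such as the time in Date) stay intact.
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    return int(code), reason, headers, entity_body


code, reason, headers, entity_body = parse_response(RAW_RESPONSE)
print(code, reason)             # 200 OK
print(headers["content-type"])  # text/html
print(headers["date"])          # Fri, 17 Nov 2006 15:36:32 GMT
```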

The most important of these stickers is worth mentioning separately. The response header Content-Type gives the media type of the entity-body. In this case, the media type is text/html. This lets my web browser know it can render the entity-body as an HTML document: a web page.

There’s a standard list of media types (http://www.iana.org/assignments/media-types/). The most common media types designate textual documents (text/html), structured data documents (application/xml), and images (image/jpeg). In other discussions of REST or HTTP, you may see the media type called the “MIME type,” “content type,” or “data type.”
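Here is a rough Python sketch of how a client might use the media type to decide what kind of document it has been handed. Both function names are hypothetical, and the three categories are simply the ones named above; a real client would recognize many more media types. The parameter handling (for values like "charset=utf-8") is a simplification that does not cope with semicolons inside quoted values.

```python
def split_media_type(content_type):
    """Return (type, subtype, params) for a Content-Type header value.

    Hypothetical helper for illustration; splits naively on ";" and "=".
    """
    media_type, *param_parts = [part.strip() for part in content_type.split(";")]
    # Media types are case-insensitive, so normalize before comparing.
    main_type, _, subtype = media_type.lower().partition("/")
    params = {}
    for part in param_parts:
        key, _, value = part.partition("=")
        params[key.strip().lower()] = value.strip().strip('"')
    return main_type, subtype, params


def describe(content_type):
    """Map a media type onto the three broad categories from the text."""
    main_type, subtype, _params = split_media_type(content_type)
    if main_type == "text":
        return "textual document"
    if main_type == "image":
        return "image"
    if (main_type, subtype) == ("application", "xml"):
        return "structured data document"
    return "unknown"


print(describe("text/html"))        # textual document
print(describe("application/xml"))  # structured data document
print(describe("image/jpeg"))       # image
print(split_media_type("text/html; charset=utf-8"))
# ('text', 'html', {'charset': 'utf-8'})
```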
