RESTful Web Services by Sam Ruby, Leonard Richardson

URIs

What makes a resource a resource? It has to have at least one URI. The URI is the name and address of a resource. If a piece of information doesn’t have a URI, it’s not a resource and it’s not really on the Web, except as a bit of data describing some other resource.

Remember the sample session in the Preface, when I was making fun of HTTP 0.9? Let’s say this is an HTTP 0.9 request for http://www.example.com/hello.txt:

Client request:
GET /hello.txt

Server response:
Hello, world!

An HTTP client manipulates a resource by connecting to the server that hosts it (in this case, www.example.com), and sending the server a method (“GET”) and a path to the resource (“/hello.txt”). Today’s HTTP 1.1 is a little more complex than 0.9, but it works the same way. Both the server and the path come from the resource’s URI.

Client request:
GET /hello.txt HTTP/1.1
Host: www.example.com

Server response:
200 OK
Content-Type: text/plain

Hello, world!
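
To make the last point concrete, here is a minimal Python sketch of how a client might split a URI into the server to connect to and the path to send. The helper name `request_lines` is hypothetical, invented for this illustration; only `urllib.parse.urlsplit` comes from the standard library. The sketch formats the request text and nothing more: it opens no connection.

```python
from urllib.parse import urlsplit


def request_lines(uri, method="GET"):
    """Build the text of a bare HTTP/1.1 request for the given URI.

    Hypothetical helper for illustration: it formats the request
    but does not open a connection or send anything.
    """
    parts = urlsplit(uri)
    host = parts.netloc          # the server to connect to
    path = parts.path or "/"     # the path sent to that server
    if parts.query:
        path += "?" + parts.query
    return [f"{method} {path} HTTP/1.1", f"Host: {host}"]


for line in request_lines("http://www.example.com/hello.txt"):
    print(line)
```

Run against the example URI, this prints the same two request lines shown in the HTTP 1.1 exchange above.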

The principles behind URIs are well described by Tim Berners-Lee in Universal Resource Identifiers—Axioms of Web Architecture. In this section I expound the principles behind constructing URIs and assigning them to resources.

Tip

The URI is the fundamental technology of the Web. There were hypertext systems before HTML, and Internet protocols before HTTP, but they didn’t talk to each other. The URI interconnected all these Internet protocols into a Web, the way TCP/IP interconnected networks like Usenet, Bitnet, and CompuServe into a single Internet. Then the Web co-opted those other protocols and killed them off, just like the Internet did with private networks.

Today we surf the Web (not Gopher), download files from the Web (not FTP sites), search publications from the Web (not WAIS), and have conversations on the Web (not Usenet newsgroups). Version control systems like Subversion and arch work over the Web, as opposed to the custom CVS protocol. Even email is slowly moving onto the Web.

The Web kills off other protocols because it has something most protocols lack: a simple way of labeling every available item. Every resource on the Web has at least one URI. You can stick a URI on a billboard. People can see that billboard, type that URI into their web browsers, and go right to the resource you wanted to show them. It may seem strange, but this everyday interaction was impossible before URIs were invented.

URIs Should Be Descriptive

Here’s the first point where the ROA builds upon the sparse recommendations of the REST thesis and the W3C recommendations. I propose that a resource and its URI ought to have an intuitive correspondence. Here are some good URIs for the resources I listed above:

  • http://www.example.com/software/releases/1.0.3.tar.gz

  • http://www.example.com/software/releases/latest.tar.gz

  • http://www.example.com/weblog/2006/10/24/0

  • http://www.example.com/map/roads/USA/AR/Little_Rock

  • http://www.example.com/wiki/Jellyfish

  • http://www.example.com/search/Jellyfish

  • http://www.example.com/nextprime/1024

  • http://www.example.com/next-5-primes/1024

  • http://www.example.com/sales/2004/Q4

  • http://www.example.com/relationships/Alice;Bob

  • http://www.example.com/bugs/by-state/open

URIs should have a structure. They should vary in predictable ways: you should not go to /search/Jellyfish for jellyfish and /i-want-to-know-about/Mice for mice. If a client knows the structure of the service’s URIs, it can create its own entry points into the service. This makes it easy for clients to use your service in ways you didn’t think of.
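
As a rough illustration of a client creating its own entry points, the Python sketch below builds URIs from the patterns visible in the list above. It assumes the service really does follow the `/search/{term}` and `/sales/{year}/Q{quarter}` patterns; the function names are hypothetical, and nothing here contacts a server.

```python
from urllib.parse import quote

BASE = "http://www.example.com"  # the example host used in this section


def search_uri(term):
    """Build a search URI, assuming the /search/{term} pattern holds."""
    return f"{BASE}/search/{quote(term, safe='')}"


def sales_uri(year, quarter):
    """Build a sales URI, assuming the /sales/{year}/Q{quarter} pattern."""
    return f"{BASE}/sales/{year}/Q{quarter}"


print(search_uri("Mice"))   # a search the service never advertised
print(sales_uri(2005, 1))   # a quarter the client worked out for itself
```

A client written this way would break if the service used /i-want-to-know-about/Mice for mice, which is the practical argument for predictable URIs.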

This is not an absolute rule of REST, as we’ll see in the “Name the Resources” section of Chapter 5. URIs do not technically have to have any structure or predictability, but I think they should. This is one of the rules of good web design, and it shows up in RESTful and REST-RPC hybrid services alike.

The Relationship Between URIs and Resources

Let’s consider some edge cases. Can two resources be the same? Can two URIs designate the same resource? Can a single URI designate two resources?

By definition, no two resources can be the same. If they were the same, you’d only have one resource. However, at some moment in time two different resources may point to the same data. If the current software release is 1.0.3, then http://www.example.com/software/releases/1.0.3.tar.gz and http://www.example.com/software/releases/latest.tar.gz will refer to the same file for a while. But the ideas behind those two URIs are different: one of them always points to a particular version, and the other points to whatever version is newest at the time the client accesses it. That’s two concepts and two resources. You wouldn’t link to latest when reporting a bug in version 1.0.3.
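
The difference between the two release resources can be sketched in a few lines of Python. The `releases` store, its placeholder contents, and the `fetch` function are all hypothetical; the point is only that “latest” is resolved at request time, so the two paths agree for a while and then diverge.

```python
# Hypothetical release store: version string -> file contents.
releases = {
    "1.0.2": b"bytes of release 1.0.2",
    "1.0.3": b"bytes of release 1.0.3",
}


def version_key(version):
    """Turn '1.0.3' into (1, 0, 3) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def fetch(path):
    """Return the data behind a release path at the moment of the request."""
    prefix, suffix = "/software/releases/", ".tar.gz"
    name = path[len(prefix):-len(suffix)]
    if name == "latest":
        # "Whatever version is newest right now" is resolved per request.
        name = max(releases, key=version_key)
    return releases[name]


pinned = "/software/releases/1.0.3.tar.gz"
latest = "/software/releases/latest.tar.gz"

same_before = fetch(pinned) == fetch(latest)   # same data while 1.0.3 is newest
releases["1.0.4"] = b"bytes of release 1.0.4"  # a new release is published
same_after = fetch(pinned) == fetch(latest)    # the two resources have diverged
print(same_before, same_after)
```

The comparison is true before the new release and false after it, even though neither URI changed.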

A resource may have one URI or many. The sales numbers available at http://www.example.com/sales/2004/Q4 might also be available at http://www.example.com/sales/Q42004. If a resource has multiple URIs, it’s easier for clients to refer to the resource. The downside is that each additional URI dilutes the value of all the others. Some clients use one URI, some use another, and there’s no automatic way to verify that all the URIs refer to the same resource.

Tip

One way to get around this is to expose multiple URIs for the same resource, but have one of them be the “canonical” URI for that resource. When a client requests the canonical URI, the server sends the appropriate data along with a response code of 200 (“OK”). When a client requests one of the other URIs, the server sends a response code of 303 (“See Other”) along with the canonical URI. The client can’t see whether two URIs point to the same resource, but it can make two HEAD requests and see if one URI redirects to the other or if they both redirect to a third URI.

Another way is to serve all the URIs as though they were the same, but give the “canonical” URI in the Content-Location response header whenever someone requests a non-canonical URI.
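
Both strategies can be sketched as plain Python functions that map a request path to a status code, headers, and body. This is a simulation under stated assumptions, not a real server: the names `respond`, `respond_in_place`, and `canonical_of` are hypothetical, the data is a placeholder, and the client check follows at most one redirect.

```python
CANONICAL = "/sales/2004/Q4"
ALIASES = {"/sales/Q42004"}          # hypothetical alternate URI paths
DATA = "sales figures for Q4 2004"   # placeholder representation


def respond(path):
    """Return (status, headers, body) for a request path.

    Redirect strategy: aliases answer 303 and point at the canonical URI.
    """
    if path == CANONICAL:
        return 200, {}, DATA
    if path in ALIASES:
        return 303, {"Location": CANONICAL}, ""
    return 404, {}, ""


def respond_in_place(path):
    """Alternative strategy: serve aliases directly and name the
    canonical URI in the Content-Location header."""
    if path == CANONICAL:
        return 200, {}, DATA
    if path in ALIASES:
        return 200, {"Content-Location": CANONICAL}, DATA
    return 404, {}, ""


def canonical_of(path):
    """Follow at most one redirect, as a client comparing URIs might."""
    status, headers, _ = respond(path)
    return headers["Location"] if status == 303 else path


print(canonical_of("/sales/Q42004") == canonical_of("/sales/2004/Q4"))
```

With the redirect strategy a client can discover that two URIs name the same resource; with the Content-Location strategy it gets the data in one round trip and learns the canonical URI from the header.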

Fetching sales/2004/Q4 might get you the same bytestream as fetching sales/Q42004, because they’re different URIs for the same resource: “sales for the last quarter of 2004.” Fetching releases/1.0.3.tar.gz might give you the exact same bytestream as fetching releases/latest.tar.gz, but they’re different resources because they represent different things: “version 1.0.3” and “the latest version.”

Every URI designates exactly one resource. If it designated more than one, it wouldn’t be a Universal Resource Identifier. However, when you fetch a URI the server may send you information about multiple resources: the one you requested and other, related ones. When you fetch a web page, it usually conveys some information of its own, but it also has links to other web pages. When you retrieve an S3 bucket with an Amazon S3 client, you get a document that contains information about the bucket, and information about related resources: the objects in the bucket.
