RESTful Web Services

Cover of RESTful Web Services by Leonard Richardson... Published by O'Reilly Media, Inc.

Request Signing and Access Control

I’ve put it off as long as I can, and now it’s time to deal with S3 authentication. If your main interest is in RESTful services in general, feel free to skip ahead to the section on using the S3 library in clients. But if the inner workings of S3 have piqued your interest, read on.

The code I’ve shown you so far makes HTTP requests all right, but S3 rejects them, because they don’t contain the all-important Authorization header. S3 has no proof that you’re the owner of your own buckets. Remember, Amazon charges you for the data stored on their servers and the bandwidth used in transferring that data. If S3 accepted requests to your buckets with no authorization, anyone could store data in your buckets and you’d get charged for it.

Most web services that require authentication use a standard HTTP mechanism to make sure you are who you claim to be. But S3’s needs are more complicated. With most web services you never want anyone else using your data. But one of the uses of S3 is as a hosting service. You might want to host a big movie file on S3, let anyone download it with their BitTorrent client, and have Amazon send you the bill.

Or you might be selling access to movie files stored on S3. Your e-commerce site takes payment from a customer and gives them an S3 URI they can use to download the movie. You’re delegating to someone else the right to make a particular web service call (a GET request) as you, and have it charged to your account.

The standard mechanisms for HTTP authentication can’t provide security for that kind of application. Normally, the person who’s sending the HTTP request needs to know the actual password. You can prevent someone from spying on your password, but you can’t say to someone else: “here’s my password, but you must promise only to use it to request this one URI.”

S3 solves this problem using a message authentication code (MAC). Every time you make an S3 request, you use your secret key (remember, the secret is shared between you and Amazon) to sign the important parts of the request. That’d be the URI, the HTTP method you’re using, and a few of the HTTP headers. Only someone who knows the secret can create these signatures for your requests, which is how Amazon knows it’s okay to charge you for the request. But once you’ve signed a request, you can send the signature to a third party without revealing the secret. The third party is then free to send an identical HTTP request to the one you signed, and have Amazon charge you for it. In short: someone else can make a specific request as you, for a limited time, without having to know your secret.
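
To make the idea concrete, here is a minimal sketch of a message authentication code in Ruby. It is not S3's actual signing procedure (that comes later in this section): the secret and the one-line request summaries are made up for illustration, and the helper name mac is hypothetical. It only shows that the same secret and message always give the same signature, while a changed message or a guessed secret does not.

```ruby
require 'openssl'
require 'base64'

# Hypothetical shared secret, for illustration only.
secret = 'my-shared-secret'

# Returns the base64-encoded HMAC-SHA1 of a message under a secret key.
def mac(secret, message)
  digest = OpenSSL::HMAC.digest(OpenSSL::Digest.new('sha1'), secret, message)
  Base64.strict_encode64(digest)
end

# Simplified stand-ins for "the important parts of the request".
request  = "GET /BobProductions/KomodoDragon.avi"
tampered = "GET /BobProductions/SomeOtherMovie.avi"

signature = mac(secret, request)

# Anyone holding the secret can recompute the same signature...
same_secret_matches  = (mac(secret, request) == signature)
# ...but a different request, or a different secret, gives a different one.
tampered_matches     = (mac(secret, tampered) == signature)
wrong_secret_matches = (mac('a-guess', request) == signature)

puts signature
puts same_secret_matches   # true
puts tampered_matches      # false
puts wrong_secret_matches  # false
```

The signature can be handed to a third party because it reveals nothing useful about the secret, and it is only good for the exact request that was signed.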

There is a simpler way to give anonymous access to your S3 objects, and I discuss it below. But there’s no way around signing your own requests, so even a simple library like this one must support request signing if it’s going to work. I’m reopening the S3::Authorized Ruby module now. I’m going to give it the ability to intercept calls to the open method, and sign HTTP requests before they’re made. Since S3::BucketList, S3::Bucket, and S3::Object have all included this module, they’ll inherit this ability as soon as I define it. Without the code I’m about to write, all those open calls I defined in the classes above will send unsigned HTTP requests that just bounce off S3 with response code 403 (“Forbidden”). With this code, you’ll be able to generate signed HTTP requests that pass through S3’s security measures (and cost you money). The code in Example 3-15 and the other examples that follow is heavily based on Amazon’s own example S3 library.

Example 3-15. S3 Ruby client: the S3::Authorized module

module Authorized
  # These are the standard HTTP headers that S3 considers interesting
  # for purposes of request signing.
  INTERESTING_HEADERS = ['content-type', 'content-md5', 'date']

  # This is the prefix for custom metadata headers. All such headers
  # are considered interesting for purposes of request signing.
  AMAZON_HEADER_PREFIX = 'x-amz-'

  # An S3-specific wrapper for rest-open-uri's implementation of
  # open(). This implementation sets some HTTP headers before making
  # the request. Most important of these is the Authorization header,
  # which contains the information Amazon will use to decide who to
  # charge for this request.
  def open(uri, headers_and_options={}, *args, &block)
    headers_and_options = headers_and_options.dup
    headers_and_options['Date'] ||= Time.now.httpdate
    headers_and_options['Content-Type'] ||= ''   
    signed = signature(uri, headers_and_options[:method] || :get,
                       headers_and_options)
    headers_and_options['Authorization'] = "AWS #{@@public_key}:#{signed}"
    Kernel::open(uri, headers_and_options, *args, &block)
  end

The tough work here is in the signature method, not yet defined. This method needs to construct a cryptographic signature to go into a request’s Authorization header: a string that convinces the S3 service that it’s really you sending the request, or that you’ve authorized someone else to make the request at your expense (see Example 3-16).

Example 3-16. S3 Ruby client: the Authorized#signature method

  # Builds the cryptographic signature for an HTTP request. This is
  # the signature (signed with your secret key) of a "canonical
  # string" containing all interesting information about the request.
  def signature(uri, method=:get, headers={}, expires=nil)
    # Accept the URI either as a string, or as a Ruby URI object.
    if uri.respond_to? :path
      path = uri.path
    else
      uri = URI.parse(uri)
      path = uri.path + (uri.query ? "?" + uri.query : "")
    end

    # Build the canonical string, then sign it.
    signed_string = sign(canonical_string(method, path, headers, expires))
  end

Well, this method passes the buck again, by calling sign on the result of canonical_string. Let’s look at those two methods, starting with canonical_string. It turns an HTTP request into a string that looks something like Example 3-17. That string contains everything interesting (from S3’s point of view) about an HTTP request, in a specific format. The interesting data is the HTTP method (PUT), the Content-Type (“text/plain”), a date, a few other HTTP headers (“x-amz-metadata”), and the path portion of the URI (“/crummy.com/myobject”). The blank second line is the empty value of the Content-MD5 header, which this request doesn’t send. This is the string that sign will sign. Anyone can create this string, but only the S3 account holder and Amazon know how to produce the correct signature.

Example 3-17. The canonical string for a sample request

PUT

text/plain
Fri, 27 Oct 2006 21:22:41 GMT
x-amz-metadata:Here's some metadata for the myobject object.
/crummy.com/myobject

When Amazon’s server receives your HTTP request, it generates the canonical string, signs it (again, Amazon knows your secret key), and sees whether the two signatures match. That’s how S3 authentication works. If the signatures match, your request goes through. Otherwise, you get a response code of 403 (“Forbidden”).
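
The check on Amazon’s side can be pictured roughly as follows. This is only a sketch of the idea, not Amazon’s implementation: the method names sign_with and response_code_for, the key ID, and the secret are all hypothetical, and the canonical string is the one from Example 3-17.

```ruby
require 'openssl'
require 'base64'

# Signs a canonical string with HMAC-SHA1 and base64-encodes the result.
def sign_with(secret, canonical)
  digest = OpenSSL::HMAC.digest(OpenSSL::Digest.new('sha1'), secret, canonical)
  Base64.encode64(digest).strip
end

# Hypothetical server-side check: look up the account's secret, recompute
# the signature, and compare it with the one the client sent in a header of
# the form "AWS key-id:signature".
def response_code_for(secrets_by_key_id, authorization, canonical)
  scheme, credentials = authorization.split(' ', 2)
  return 403 unless scheme == 'AWS' && credentials
  key_id, client_signature = credentials.split(':', 2)
  secret = secrets_by_key_id[key_id]
  return 403 unless secret && client_signature
  sign_with(secret, canonical) == client_signature ? 200 : 403
end

# Made-up account data, for illustration only.
secrets = { 'EXAMPLEKEYID' => 'example-secret' }
canonical = "PUT\n\ntext/plain\nFri, 27 Oct 2006 21:22:41 GMT\n" \
            "x-amz-metadata:Here's some metadata for the myobject object.\n" \
            "/crummy.com/myobject"

good = "AWS EXAMPLEKEYID:" + sign_with('example-secret', canonical)
bad  = "AWS EXAMPLEKEYID:" + sign_with('wrong-secret', canonical)

puts response_code_for(secrets, good, canonical)  # 200
puts response_code_for(secrets, bad, canonical)   # 403
```

A production server would probably compare the two signatures in constant time rather than with ==, but the principle is the same.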

Example 3-18 shows the code to generate the canonical string.

Example 3-18. S3 Ruby client: the Authorized#canonical_string method

  # Turns the elements of an HTTP request into a string that can be
  # signed to prove a request comes from your web service account.
  def canonical_string(method, path, headers, expires=nil)

    # Start out with default values for all the interesting headers.
    sign_headers = {}
    INTERESTING_HEADERS.each { |header| sign_headers[header] = '' }

    # Copy in any actual values, including values for custom S3
    # headers.
    headers.each do |header, value|
      if header.respond_to? :to_str
        header = header.downcase
        # If it's a custom header, or one Amazon thinks is interesting...
        if INTERESTING_HEADERS.member?(header) ||
            header.index(AMAZON_HEADER_PREFIX) == 0
          # Add it to the header hash.
          sign_headers[header] = value.to_s.strip
        end
      end
    end
 
    # This library eliminates the need for the x-amz-date header that
    # Amazon defines, but someone might set it anyway. If they do,
    # we'll do without HTTP's standard Date header.
    sign_headers['date'] = '' if sign_headers.has_key? 'x-amz-date'

    # If an expiration time was provided, it overrides any Date
    # header. This signature will be valid until the expiration time,
    # not only during the single second designated by the Date header.
    sign_headers['date'] = expires.to_s if expires

    # Now we start building the canonical string for this request. We
    # start with the HTTP method.
    canonical = method.to_s.upcase + "\n"

    # Sort the headers by name, and append them (or just their values)
    # to the string to be signed.
    sign_headers.sort_by { |h| h[0] }.each do |header, value|
      canonical << header << ":" if header.index(AMAZON_HEADER_PREFIX) == 0
      canonical << value << "\n"
    end

    # The final part of the string to be signed is the URI path. We
    # strip off the query string, and (if necessary) tack one of the
    # special S3 query parameters back on: 'acl', 'torrent', or
    # 'logging'.
    canonical << path.gsub(/\?.*$/, '')

    for param in ['acl', 'torrent', 'logging']
      if path =~ Regexp.new("[&?]#{param}($|&|=)")
        canonical << "?" << param
        break
      end
    end
    return canonical
  end
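
To experiment with the format outside the S3 module, a condensed standalone version can be handy. The sketch below restates the same steps under a hypothetical name, canonical_string_sketch, and feeds it the sample request behind Example 3-17. It is a convenience for trying things out, not a replacement for the method above.

```ruby
# Condensed, standalone restatement of the canonical string rules.
def canonical_string_sketch(method, path, headers, expires = nil)
  # Default (empty) values for the standard interesting headers.
  interesting = { 'content-type' => '', 'content-md5' => '', 'date' => '' }
  headers.each do |name, value|
    next unless name.respond_to? :to_str   # skip options like :method
    name = name.downcase
    if interesting.key?(name) || name.start_with?('x-amz-')
      interesting[name] = value.to_s.strip
    end
  end
  interesting['date'] = '' if interesting.key?('x-amz-date')
  interesting['date'] = expires.to_s if expires

  # HTTP method first, then the headers sorted by name. Custom headers
  # appear as "name:value", standard ones as the bare value.
  lines = [method.to_s.upcase]
  interesting.sort.each do |name, value|
    lines << (name.start_with?('x-amz-') ? "#{name}:#{value}" : value)
  end

  # Finally the path, minus its query string, plus at most one of the
  # special S3 query parameters.
  resource = path.sub(/\?.*\z/m, '')
  special = %w[acl torrent logging].find { |p| path =~ /[&?]#{p}($|&|=)/ }
  resource += "?#{special}" if special
  (lines << resource).join("\n")
end

sample = canonical_string_sketch(
  :put, '/crummy.com/myobject',
  'Content-Type'   => 'text/plain',
  'Date'           => 'Fri, 27 Oct 2006 21:22:41 GMT',
  'x-amz-metadata' => "Here's some metadata for the myobject object.")
puts sample
```

The output should match Example 3-17 line for line, including the blank line where the Content-MD5 value would go.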

The implementation of sign is just a bit of plumbing around Ruby’s standard cryptographic and encoding interfaces (see Example 3-19).

Example 3-19. S3 Ruby client: the Authorized#sign method

  # Signs a string with the client's secret access key, and encodes the
  # resulting binary string into plain ASCII with base64.
  def sign(str)
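    # OpenSSL::Digest::Digest is deprecated and missing from recent
    # Ruby versions; OpenSSL::Digest does the same job.
    digest_generator = OpenSSL::Digest.new('sha1')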
    digest = OpenSSL::HMAC.digest(digest_generator, @@private_key, str)
    return Base64.encode64(digest).strip
  end

Signing a URI

My S3 library has one feature still to be implemented. I’ve mentioned a few times that S3 lets you sign an HTTP request and give the URI to someone else, letting them make that request as you. Here’s the method that lets you do this: signed_uri (see Example 3-20). Instead of making an HTTP request with open, you pass the open arguments into this method, and it gives you a signed URI that anyone can use as you. To limit abuse, a signed URI works only for a limited time. You can customize that time by passing a Time object in as the keyword argument :expires.

Example 3-20. S3 Ruby client: the Authorized#signed_uri method

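  # Given information about an HTTP request, returns a URI you can
  # give to anyone else, to let them make that particular HTTP
  # request as you. The URI will be valid for 15 minutes, or until the
  # Time passed in as the :expires option.
  def signed_uri(headers_and_options={})
    # Work on a copy so the caller's hash isn't modified.
    headers_and_options = headers_and_options.dup
    expires = headers_and_options.delete(:expires) ||
      (Time.now.to_i + (15 * 60))
    expires = expires.to_i if expires.respond_to? :to_i
    # Sign with the expiration time, so that it takes the place of the
    # Date header in the canonical string. CGI.escape (from Ruby's
    # standard 'cgi' library, which must be required) escapes the "/",
    # "+", and "=" characters that base64 output may contain.
    signature = CGI.escape(signature(uri, headers_and_options[:method] || :get,
                                     headers_and_options, expires))
    q = (uri.index("?")) ? "&" : "?"
    "#{uri}#{q}Signature=#{signature}&Expires=#{expires}&AWSAccessKeyId=#{@@public_key}"
  end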
end

end # Remember the all-encompassing S3 module? This is the end.

Here’s how it works. Suppose I want to give a customer access to my hosted file at https://s3.amazonaws.com/BobProductions/KomodoDragon.avi. I can run the code in Example 3-21 to generate a URI for my customer.

Example 3-21. Generating a signed URI

#!/usr/bin/ruby -w
# s3-signed-uri.rb
require 'S3lib'

bucket = S3::Bucket.new("BobProductions")
object = S3::Object.new(bucket, "KomodoDragon.avi")
puts object.signed_uri
# "https://s3.amazonaws.com/BobProductions/KomodoDragon.avi
# ?Signature=J%2Fu6kxT3j0zHaFXjsLbowgpzExQ%3D
# &Expires=1162156499&AWSAccessKeyId=0F9DBXKB5274JKTJ8DG2"

That URI will be valid for 15 minutes, the default for my signed_uri implementation. It incorporates my key ID (AWSAccessKeyId), the expiration time (Expires), and the cryptographic Signature. My customer can visit this URI and download the movie file KomodoDragon.avi. Amazon will charge me for my customer’s use of their bandwidth. If my customer modifies any part of the URI (maybe they try to download a second movie too), the S3 service will reject their request. An untrustworthy customer can send the URI to all of their friends, but it will stop working in 15 minutes.

You may have noticed a problem here. The canonical string usually includes the value of the Date header. When my customer visits the URI I signed, their web browser will surely send a different value for the Date header. That’s why, when you’re signing a request for someone else to make, you set an expiration date instead of a request date. Look back to Example 3-18 and the implementation of canonical_string, where the expiration date (if provided) overwrites any value for the Date header.
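
Here is a standalone sketch of that substitution for a plain GET request. The method name presigned_get_uri, the key ID, and the secret are hypothetical, and the canonical string is written out by hand rather than built by canonical_string: it has the HTTP method, two empty lines for Content-MD5 and Content-Type, the expiration time where the date would normally go, and the path.

```ruby
require 'openssl'
require 'base64'
require 'cgi'

# Hypothetical credentials, for illustration only.
key_id = 'EXAMPLEKEYID'
secret = 'example-secret'

# Builds a signed URI for a GET request. The expiration time (seconds
# since the epoch) takes the place of the Date line in the canonical
# string, so the signature doesn't depend on when the request is sent.
def presigned_get_uri(base_uri, path, key_id, secret, expires)
  canonical = ["GET", "", "", expires.to_i.to_s, path].join("\n")
  digest = OpenSSL::HMAC.digest(OpenSSL::Digest.new('sha1'), secret, canonical)
  # Escape the base64 text so "/", "+", and "=" survive in a query string.
  signature = CGI.escape(Base64.strict_encode64(digest))
  "#{base_uri}#{path}?Signature=#{signature}&Expires=#{expires.to_i}" \
    "&AWSAccessKeyId=#{key_id}"
end

expires = Time.now.to_i + 15 * 60
uri = presigned_get_uri('https://s3.amazonaws.com',
                        '/BobProductions/KomodoDragon.avi',
                        key_id, secret, expires)
puts uri
```

Because the Expires value is part of what gets signed, a customer who edits it in the URI to buy more time ends up with a signature that no longer matches.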

Setting Access Policy

What if I want to make an object publicly accessible? I want to serve my files to the world and let Amazon deal with the headaches of server management. Well, I could set an expiration date very far in the future, and give out the enormous signed URI to everyone. But there’s an easier way to get the same results: allow anonymous access. You can do this by setting the access policy for a bucket or object, telling S3 to respond to unsigned requests for it. You do this by sending the x-amz-acl header along with the PUT request that creates the bucket or object.

That’s what the acl_policy argument to Bucket#put and Object#put does. If you want to make a bucket or object publicly readable or writable, you pass an appropriate value in for acl_policy. My client sends that value as part of the custom HTTP request header x-amz-acl. Amazon S3 reads this request header and sets the rules for bucket or object access appropriately.

The client in Example 3-22 creates an S3 object that anyone can read by visiting its URI at https://s3.amazonaws.com/BobProductions/KomodoDragon-Trailer.avi. In this scenario, I’m not selling my movies: just using Amazon as a hosting service so I don’t have to serve movies from my own web site.

Example 3-22. Creating a publicly-readable object

#!/usr/bin/ruby -w
# s3-public-object.rb
require 'S3lib'

bucket = S3::Bucket.new("BobProductions")
object = S3::Object.new(bucket, "KomodoDragon-Trailer.avi")
object.put("public-read")

S3 understands four access policies:

private

The default. Only requests signed by your “private” key are accepted.

public-read

Unsigned GET requests are accepted: anyone can download an object or list a bucket.

public-write

Unsigned GET and PUT requests are accepted. Anyone can modify an object, or add objects to a bucket.

authenticated-read

Unsigned requests are rejected, but read requests can be signed by the “private” key of any S3 user, not just your own. Basically, anyone with an S3 account can download your object or list your bucket.
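
At the HTTP level, choosing one of these policies comes down to a single request header. The sketch below builds (but doesn’t send) such a PUT request with Ruby’s standard Net::HTTP classes. The helper names canned_policies and put_request_with_policy are hypothetical, and a real request would also need the Date and Authorization headers described earlier in this section.

```ruby
require 'net/http'

# The four access policies listed above.
def canned_policies
  %w[private public-read public-write authenticated-read]
end

# Builds a PUT request that would create an object with the given access
# policy. Omitting the policy leaves the header off, which means "private".
def put_request_with_policy(path, body, acl_policy = nil)
  if acl_policy && !canned_policies.include?(acl_policy)
    raise ArgumentError, "unknown access policy: #{acl_policy}"
  end
  request = Net::HTTP::Put.new(path)
  request['x-amz-acl'] = acl_policy if acl_policy
  request.body = body
  request
end

request = put_request_with_policy('/BobProductions/KomodoDragon-Trailer.avi',
                                  'movie data goes here', 'public-read')
puts request.method        # PUT
puts request['x-amz-acl']  # public-read
```

Since x-amz-acl starts with the x-amz- prefix, it counts as an interesting header for signing: the policy you ask for is covered by the request’s signature.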

There are also fine-grained ways of granting access to a bucket or object, which I won’t cover. If you’re interested, see the section “Setting Access Policy with REST” in the S3 technical documentation. That section reveals a parallel universe of extra resources. Every bucket /{name-of-bucket} has a shadow resource /{name-of-bucket}?acl corresponding to that bucket’s access control rules, and every object /{name-of-bucket}/{name-of-object} has a shadow ACL resource /{name-of-bucket}/{name-of-object}?acl. By sending PUT requests to these URIs, and including XML representations of access control lists in the request entity-bodies, you can set specific permissions and limit access to particular S3 users.
