Cover by Sam Ruby, Leonard Richardson

Safari, the world’s most comprehensive technology and business learning platform.

Find the exact information you need to solve a problem on the fly, or go deeper to master the technologies and skills you need to succeed

Start Free Trial

No credit card required

O'Reilly logo

Method Information

HTTP is the one thing that all “animals” on the programmable web have in common. Now I’ll show you how web services distinguish themselves from each other. There are two big questions that today’s web services answer differently. If you know how a web service answers these questions, you’ll have a good idea of how well it works with the Web.

The first question is how the client can convey its intentions to the server. How does the server know a certain request is a request to retrieve some data, instead of a request to delete that same data or to overwrite it with different data? Why should the server do this instead of doing that?

I call the information about what to do with the data the method information. One way to convey method information in a web service is to put it in the HTTP method. Since this is how RESTful web services do it, I’ll have a lot more to say about this later. For now, note that the five most common HTTP methods are GET, HEAD, PUT, DELETE, and POST. This is enough to distinguish between “retrieve some data” (GET), “delete that same data” (DELETE), and “overwrite it with different data” (PUT).

The great advantage of HTTP method names is that they’re standardized. Of course, the space of HTTP method names is much more limited than the space of method names in a programming language. Some web services prefer to look for application-specific method names elsewhere in the HTTP request: usually in the URI path or the request document.

Example 1-7 is a client for a web service that keeps its method information in the path: the web service for Flickr, Yahoo!’s online photo-sharing application. This sample application searches Flickr for photos. To run this program, you’ll need to create a Flickr account and apply for an API key.

Example 1-7. Searching Flickr for pictures

#!/usr/bin/ruby -w
# flickr-photo-search.rb
require 'open-uri'
require 'rexml/document'

# Returns the URI to a small version of a Flickr photo.
def small_photo_uri(photo)
  server = photo.attribute('server')
  id = photo.attribute('id')
  secret = photo.attribute('secret')
  return "http://static.flickr.com/#{server}/#{id}_#{secret}_m.jpg"
end

# Searches Flickr for photos matching a certain tag, and prints a URI
# for each search result.
def print_each_photo(api_key, tag)
  # Build the URI
  uri = "http://www.flickr.com/services/rest?method=flickr.photos.search" +
    "&api_key=#{api_key}&tags=#{tag}"

  # Make the HTTP request and get the entity-body.
  response = open(uri).read

  # Parse the entity-body as an XML document.
  doc = REXML::Document.new(response)

  # For each photo found...
  REXML::XPath.each(doc, '//photo') do |photo| 
    # ...generate and print its URI
    puts small_photo_uri(photo) if photo
  end
end

# Main program
#
if ARGV.size < 2
  puts "Usage: #{$0} [Flickr API key] [search term]"
  exit
end

api_key, tag = ARGV
print_each_photo(api_key, tag)

This program makes HTTP requests to URIs like http://www.flickr.com/services/rest?method=flickr.photos.search&api_key=xxx&tag=penguins. How does the server know what the client is trying to do? Well, the method name is pretty clearly flickr.photos.search. Except: the HTTP method is GET, and I am getting information, so it might be that the method thing is a red herring. Maybe the method information really goes in the HTTP action.

This hypothesis doesn’t last for very long, because the Flickr API supports many methods, not just “get”-type methods such as flickr.photos.search and flickr.people.findByEmail, but also methods like flickr.photos.addTags, flickr.photos.comments.deleteComment, and so on. All of them are invoked with an HTTP GET request, regardless of whether or not they “get” any data. It’s pretty clear that Flickr is sticking the method information in the method query variable, and expecting the client to ignore what the HTTP method says.

By contrast, a typical SOAP service keeps its method information in the entity-body and in a HTTP header. Example 1-8 is a Ruby script that searches the Web using Google’s SOAP-based API.

Example 1-8. Searching the Web with Google’s search service

#!/usr/bin/ruby -w
# google-search.rb
require 'soap/wsdlDriver'

# Do a Google search and print out the title of each search result
def print_page_titles(license_key, query)
  wsdl_uri = 'http://api.google.com/GoogleSearch.wsdl'
  driver = SOAP::WSDLDriverFactory.new(wsdl_uri).create_rpc_driver
  result_set = driver.doGoogleSearch(license_key, query, 0, 10, true, ' ', 
                                     false, ' ', ' ', ' ')
  result_set.resultElements.each { |result| puts result.title }
end

# Main program.
if ARGV.size < 2
  puts "Usage: #{$0} [Google license key] [query]"
  exit
end

license_key, query = ARGV
print_page_titles(license_key, query)

Tip

While I was writing this book, Google announced that it was deprecating its SOAP search service in favor of a RESTful, resource-oriented service (which, unfortunately, is encumbered by legal restrictions on use in a way the SOAP service isn’t). I haven’t changed the example because Google’s SOAP service still makes the best example I know of, and because I don’t expect you to actually run this program. I just want you to look at the code, and the SOAP and WSDL documents the code relies on.

OK, that probably wasn’t very informative, because the WSDL library hides most of the details. Here’s what happens. When you call the doGoogleSearch method, the WSDL library makes a POST request to the “endpoint” of the Google SOAP service, located at the URI http://api.google.com/search/beta2. This single URI is the destination for every API call, and only POST requests are ever made to it. All of these details are in the WSDL file found at http://api.google.com/GoogleSearch.wsdl, which contains details like the definition of doGoogleSearch (Example 1-9).

Example 1-9. Part of the WSDL description for Google’s search service

<operation name="doGoogleSearch">
 <input message="typens:doGoogleSearch"/>
 <output message="typens:doGoogleSearchResponse"/>
</operation>

Since the URI and the HTTP method never vary, the method information—that “doGoogleSearch”—can’t go in either place. Instead, it goes into the entity-body of the POST request. Example 1-10 shows what HTTP request you might make to do a search for REST.

Example 1-10. A sample SOAP RPC call

POST search/beta2 HTTP/1.1
Host: api.google.com
Content-Type: application/soap+xml
SOAPAction: urn:GoogleSearchAction

<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
 <soap:Body>
  <gs:doGoogleSearch xmlns:gs="urn:GoogleSearch">
   <q>REST</q>
   ...
  </gs:doGoogleSearch>
 </soap:Body>
</soap:Envelope>

The method information is “doGoogleSearch.” That’s the name of the XML tag inside the SOAP Envelope, it’s the name of the operation in the WSDL file, and it’s the name of the Ruby method in Example 1-8.

Let’s bring things full circle by considering not the Google SOAP search API, but the Google search engine itself. To use your web browser to search Google’s data set for REST, you’d send a GET request to http://www.google.com/search?q=REST and get an HTML response back. The method information is kept in the HTTP method: you’re GETting a list of search results.

Find the exact information you need to solve a problem on the fly, or go deeper to master the technologies and skills you need to succeed

Start Free Trial

No credit card required