Cover by Sam Ruby, Leonard Richardson


Making the Request: HTTP Libraries

Every modern programming language has one or more libraries for making HTTP requests. Not all of these libraries are equally useful, though. To build a fully general web service client you need an HTTP library with these features:

  • It must support HTTPS and SSL certificate validation. Web services, like web sites, use HTTPS to secure communication with their clients. Many web services won’t accept plain HTTP requests at all. A library’s HTTPS support often depends on the presence of an external SSL library written in C.

  • It must support at least the five main HTTP methods: GET, HEAD, POST, PUT, and DELETE. Some libraries support only GET and POST. Others are designed for simplicity and support only GET.

    You can get pretty far with a client that only supports GET and POST: HTML forms support only those two methods, so the entire human web is open to you. You can even do all right with just GET, because many web services (Flickr among them) use GET even where they shouldn’t. But if you’re choosing a library for all your web service clients, or writing a general client like a WADL client, you need a library that supports all five methods. Additional methods like OPTIONS and TRACE, and WebDAV extensions like MOVE, are a bonus.

  • It must allow the programmer to customize the data sent as the entity-body of a PUT or POST request.

  • It must allow the programmer to customize a request’s HTTP headers.

  • It must give the programmer access to the response code and headers of an HTTP response; not just access to the entity-body.

  • It must be able to communicate through an HTTP proxy. The average programmer may not think about this, but many HTTP clients in corporate environments can only work through a proxy. Intermediaries like HTTP proxies are also a standard part of the REST meta-architecture, though not one I’ll be covering in much detail.
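Here is a minimal sketch, in Python, of what several of these requirements look like in practice: a non-GET method, a custom entity-body, custom request headers, and access to the response code and headers. It uses only the standard library's http.client module. The EchoHandler test service, the /bookmarks/1 path, and the X-Example header are hypothetical, invented so the sketch can run without a real web service. It does not cover HTTPS or proxies.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    """Hypothetical test service: echoes back what the client sent."""

    def do_PUT(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        # Report the method and custom header the server actually saw.
        self.send_header("X-Seen-Method", self.command)
        self.send_header("X-Seen-Header", self.headers.get("X-Example", ""))
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def put_with_custom_request(host, port, path, body, headers):
    """Send a PUT and return (status code, response headers, entity-body)."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("PUT", path, body=body, headers=headers)
        response = conn.getresponse()
        return response.status, dict(response.getheaders()), response.read()
    finally:
        conn.close()


def demo():
    """Start the test service on a free local port and send it one PUT."""
    server = HTTPServer(("127.0.0.1", 0), EchoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return put_with_custom_request(
            "127.0.0.1", server.server_port, "/bookmarks/1",
            body=b"<post/>",
            headers={"X-Example": "hello",
                     "Content-Type": "application/xml"})
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    status, headers, body = demo()
    print(status, headers["X-Seen-Method"], body)
```

A library that only hands back the entity-body would give you the last of those three values and nothing else.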

Optional Features

There are also some features of an HTTP library that make life easier as you write clients for RESTful and hybrid services. These features mostly boil down to knowledge about HTTP headers, so they’re technically optional. You can implement them yourself so long as your library gives you access to request and response HTTP headers. The advantage of library support is that you don’t have to worry about the details.

  • An HTTP library should automatically request data in compressed form to save bandwidth, and transparently decompress the data it receives. The HTTP request header here is Accept-Encoding, and the response header is Content-Encoding. I discuss these in more detail in Chapter 8.

  • It should automatically cache the responses to your requests. The second time you request a URI, it should return an item from the cache if the object on the server hasn’t changed. The HTTP headers here are If-None-Match and If-Modified-Since for the request, and ETag and Last-Modified for the response. These, too, I discuss in Chapter 8.

  • It should transparently support the most common forms of HTTP authentication: Basic, Digest, and WSSE. It’s useful to support custom, company-specific authentication methods such as Amazon’s, or to have plug-ins that support them.

    The request header is Authorization and the response header (the one that demands authentication) is WWW-Authenticate. I cover the standard HTTP authentication methods, plus WSSE, in Chapter 8. I cover Amazon’s custom authentication method in Chapter 3.

  • It should be able to transparently follow HTTP redirects, while avoiding infinite redirects and redirect loops. This should be an optional convenience for the user, rather than something that happens on every single redirect. A web service may reasonably send a status code of 303 (“See Other”) without implying that the client should go fetch that other URI right now!

  • It should be able to parse and create HTTP cookie strings, rather than forcing the programmer to manually set the Cookie header. This is not very important for RESTful services, which shun cookies, but it’s very important if you want to use the human web.
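If your library doesn't handle these headers for you, doing it by hand might look roughly like the following Python sketch. It covers only three items from the list: compression, conditional requests, and Basic authentication. The function names are my own, chosen for illustration, not part of any library. The request-side partner of the ETag response header is If-None-Match.

```python
import base64
import gzip


def basic_auth_header(username, password):
    """Build the value of an Authorization header for HTTP Basic auth."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def conditional_headers(cached_etag=None, cached_last_modified=None):
    """Build request headers for a compressed, conditional GET.

    The arguments are the ETag and Last-Modified values saved from an
    earlier response to the same URI, if there was one.
    """
    headers = {"Accept-Encoding": "gzip"}
    if cached_etag:
        headers["If-None-Match"] = cached_etag
    if cached_last_modified:
        headers["If-Modified-Since"] = cached_last_modified
    return headers


def decode_body(response_headers, body):
    """Undo gzip compression if the response says it was applied."""
    if response_headers.get("Content-Encoding", "").lower() == "gzip":
        return gzip.decompress(body)
    return body


if __name__ == "__main__":
    print(basic_auth_header("username", "password"))
    print(conditional_headers(cached_etag='"abc"'))
    print(decode_body({"Content-Encoding": "gzip"}, gzip.compress(b"<posts/>")))
```

A real cache also has to store the response body and reuse it when the server answers 304 ("Not Modified"), which is why library support is worth having.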

When you’re writing code against a specific service, you may be able to do without some or all of these features. Ruby’s standard open-uri library only supports GET requests. If the service you’re writing a client for expects only GET requests, that’s no problem. But try to use open-uri with Amazon S3 (which uses GET, HEAD, PUT, and DELETE), and you’ll quickly run into a wall. In the next sections I recommend good HTTP client libraries for some popular programming languages.
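Before moving on to specific libraries, here is a rough Python sketch of one optional feature from the list above: following redirects while guarding against loops and runaway chains. It is an explicit function the caller opts into, not something that happens on every request. The fetch argument is a hypothetical callable that takes a URI and returns a status code and a dictionary of response headers. A real client would make an HTTP request there. The limit of five hops is an arbitrary choice.

```python
from urllib.parse import urljoin

REDIRECT_CODES = {301, 302, 303, 307}


class RedirectError(Exception):
    """Raised when a redirect chain loops or runs too long."""


def follow_redirects(fetch, uri, max_hops=5):
    """Follow redirects starting at `uri`.

    `fetch` is a hypothetical callable: fetch(uri) -> (status, headers).
    Returns (final_uri, status, headers) for the first response that is
    not a redirect.
    """
    seen = {uri}
    for _ in range(max_hops + 1):
        status, headers = fetch(uri)
        location = headers.get("Location")
        if status not in REDIRECT_CODES or not location:
            return uri, status, headers
        # Location may be relative, so resolve it against the current URI.
        uri = urljoin(uri, location)
        if uri in seen:
            raise RedirectError("redirect loop at " + uri)
        seen.add(uri)
    raise RedirectError("more than %d redirects" % max_hops)


if __name__ == "__main__":
    # A canned set of responses standing in for a real server.
    pages = {
        "http://example.com/a": (301, {"Location": "/b"}),
        "http://example.com/b": (200, {}),
    }
    print(follow_redirects(pages.__getitem__, "http://example.com/a"))
```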

Ruby: rest-open-uri and net/http

Ruby comes with two HTTP client libraries, open-uri and the lower-level net/http. Either can make HTTPS requests if you’ve got the net/https extension installed. Windows installations of Ruby should be able to make HTTPS requests out of the box. If you’re not on Windows, you may have to install net/https separately.[8]

The open-uri library has a simple and elegant interface that lets you treat URIs as filenames. To read a web page, you simply open its URI and read data from the “filehandle.” You can pass in a hash to open containing custom HTTP headers and open-specific keyword arguments. This lets you set up a proxy, or specify authentication information.

Unfortunately, right now open-uri only supports one HTTP method: GET. That’s why I’ve made some minor modifications to open-uri and made the result available as the rest-open-uri Ruby gem.[9] I’ve added two keyword arguments to open: :method, which lets you customize the HTTP method, and :body, which lets you send data in the entity-body.

Example 2-4 is an implementation of the standard example using the open-uri library (rest-open-uri works the same way). This code parses the response document using the REXML::Document parser, which you’ve seen before.

Example 2-4. A Ruby client using open-uri

#!/usr/bin/ruby -w
# delicious-open-uri.rb

require 'open-uri'
require 'rexml/document'

# Fetches a user's recent bookmarks, and prints each one.
def print_my_recent_bookmarks(username, password)
  # Make the HTTPS request.
  response = open('',
                  :http_basic_authentication => [username, password])

  # Read the response entity-body as an XML document.
  xml = response.read

  # Turn the document into a data structure.
  document = REXML::Document.new(xml)

  # For each bookmark...
  REXML::XPath.each(document, "/posts/post") do |e|
    # Print the bookmark's description and URI
    puts "#{e.attributes['description']}: #{e.attributes['href']}"
  end
end

# Main program
username, password = ARGV
unless username and password
  puts "Usage: #{$0} [username] [password]"
  exit
end
print_my_recent_bookmarks(username, password)

I mentioned earlier that Ruby’s stock open-uri can only make HTTP GET requests. For many purposes, GET is enough, but if you want to write a Ruby client for a fully RESTful service like Amazon’s S3, you’ll either need to use rest-open-uri, or turn to Ruby’s low-level HTTP library: net/http.

This built-in library provides the Net::HTTP class, which has several methods for making HTTP requests (see Table 2-1). You can build a complete HTTP client out of this class, using nothing more than the Ruby standard library. In fact, open-uri and rest-open-uri are based on Net::HTTP. Those libraries only exist because Net::HTTP provides no simple, easy-to-use interface that supports all the features a REST client needs (proxies, HTTPS, headers, and so on). That’s why I recommend you use rest-open-uri.

Table 2-1. HTTP feature matrix for Ruby HTTP client libraries

               | open-uri | rest-open-uri | Net::HTTP
HTTP verbs     | GET      | All           | All
Custom data    | No       | Yes           | Yes
Custom headers | Yes      | Yes           | Yes
Auth methods   | Basic    | Basic         | Basic

[a] Assuming the net/https library is installed

Python: httplib2

The Python standard library comes with two HTTP clients: urllib2, which has a file-like interface like Ruby’s open-uri; and httplib, which works more like Ruby’s Net::HTTP. Both offer transparent support for HTTPS, assuming your copy of Python was compiled with SSL support. There’s also an excellent third-party library, Joe Gregorio’s httplib2, which is the one I recommend in general. httplib2 is an excellent piece of software, supporting nearly every feature on my wish list—most notably, transparent caching. Table 2-2 lists the features available in each library.

Table 2-2. HTTP feature matrix for Python HTTP client libraries

               | urllib2 | httplib | httplib2
Custom data    | Yes | Yes | Yes
Custom headers | Yes | Yes | Yes
Auth methods   | Basic, Digest | None | Basic, Digest, WSSE, Google
Cookies        | Yes (Use urllib2.build_opener(HTTPCookieProcessor)) | No | No

[a] Assuming Python was compiled with SSL support

Example 2-5 is a client that uses httplib2. It uses the ElementTree library to parse the XML.

Example 2-5. A client in Python

import sys
from xml.etree import ElementTree
import httplib2

# Fetches a user's recent bookmarks, and prints each one.
def print_my_recent_bookmarks(username, password):
    client = httplib2.Http(".cache")
    client.add_credentials(username, password)

    # Make the HTTP request, and fetch the response and the entity-body.
    response, xml = client.request('')

    # Turn the XML entity-body into a data structure.
    doc = ElementTree.fromstring(xml)

    # Print information about every bookmark.
    for post in doc.findall('post'):
        print("%s: %s" % (post.attrib['description'], post.attrib['href']))

# Main program
if len(sys.argv) != 3:
    print("Usage: %s [username] [password]" % sys.argv[0])
    sys.exit(1)

username, password = sys.argv[1:]
print_my_recent_bookmarks(username, password)
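To try the parsing step on its own, without a network connection or an account, you can feed ElementTree a canned document. The sketch below is written for Python 3. The sample XML is invented for illustration: it assumes the response is a posts element containing post elements with description and href attributes, which is what the client above expects. The extract_bookmarks function is my own name for that step.

```python
from xml.etree import ElementTree

# An invented sample response, standing in for what the service returns.
SAMPLE_XML = """<?xml version='1.0' standalone='yes'?>
<posts tag="" user="username">
  <post href="http://www.example.com/" description="An example bookmark"/>
  <post href="http://www.example.org/" description="Another bookmark"/>
</posts>"""


def extract_bookmarks(xml):
    """Return a list of (description, href) pairs, one per bookmark."""
    doc = ElementTree.fromstring(xml)
    return [(post.attrib["description"], post.attrib["href"])
            for post in doc.findall("post")]


if __name__ == "__main__":
    for description, href in extract_bookmarks(SAMPLE_XML):
        print("%s: %s" % (description, href))
```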

Java: HttpClient

The Java standard library comes with an HTTP client, HttpURLConnection. You can get an instance by calling openConnection on a java.net.URL object. Though it supports most of the basic features of HTTP, programming to its API is very difficult. The Apache Jakarta project has a competing client called HttpClient, which has a better design. There’s also Restlet. I cover Restlet as a server library in Chapter 12, but it’s also an HTTP client library. The class org.restlet.Client makes it easy to make simple HTTP requests, and a companion class hides the HttpURLConnection programming necessary to make more complex requests. Table 2-3 lists the features available in each library.

Table 2-3. HTTP feature matrix for Java HTTP client libraries.

               | HttpURLConnection   | HttpClient          | Restlet
HTTP verbs     | All                 | All                 | All
Custom data    | Yes                 | Yes                 | Yes
Custom headers | Yes                 | Yes                 | Yes
Auth methods   | Basic, Digest, NTLM | Basic, Digest, NTLM | Basic, Amazon

Example 2-6 is a Java client that uses HttpClient. It works in Java 1.5 and up, and it’ll work in previous versions if you install the Xerces parser (see “Java: javax.xml, Xerces, or XMLPull” later in this chapter).

Example 2-6. A client in Java


import java.io.*;

import org.apache.commons.httpclient.*;
import org.apache.commons.httpclient.auth.AuthScope;
import org.apache.commons.httpclient.methods.GetMethod;

import org.w3c.dom.*;
import org.xml.sax.SAXException;
import javax.xml.parsers.*;
import javax.xml.xpath.*;

/**
 * A command-line application that fetches a user's bookmarks
 * and prints them to standard output.
 */
public class DeliciousApp
{
  public static void main(String[] args)
    throws HttpException, IOException, ParserConfigurationException,
           SAXException, XPathExpressionException
  {
    if (args.length != 2)
    {
      System.out.println("Usage: java -classpath [CLASSPATH] "
                         + "DeliciousApp [USERNAME] [PASSWORD]");
      System.out.println("[CLASSPATH] - Must contain commons-codec, " +
                         "commons-logging, and commons-httpclient");
      System.out.println("[USERNAME]  - Your username");
      System.out.println("[PASSWORD]  - Your password");
      System.exit(1);
    }

    // Set the authentication credentials.
    Credentials creds = new UsernamePasswordCredentials(args[0], args[1]);
    HttpClient client = new HttpClient();
    client.getState().setCredentials(AuthScope.ANY, creds);

    // Make the HTTP request.
    String url = "";
    GetMethod method = new GetMethod(url);
    client.executeMethod(method);
    InputStream responseBody = method.getResponseBodyAsStream();

    // Turn the response entity-body into an XML document.
    DocumentBuilderFactory docBuilderFactory =
      DocumentBuilderFactory.newInstance();
    DocumentBuilder docBuilder =
      docBuilderFactory.newDocumentBuilder();
    Document doc = docBuilder.parse(responseBody);
    method.releaseConnection();

    // Hit the XML document with an XPath expression to get the list
    // of bookmarks.
    XPath xpath = XPathFactory.newInstance().newXPath();
    NodeList bookmarks = (NodeList)xpath.evaluate("/posts/post", doc,
                                                  XPathConstants.NODESET);

    // Iterate over the bookmarks and print out each one.
    for (int i = 0; i < bookmarks.getLength(); i++)
    {
       NamedNodeMap bookmark = bookmarks.item(i).getAttributes();
       String description = bookmark.getNamedItem("description")
                                    .getNodeValue();
       String uri = bookmark.getNamedItem("href").getNodeValue();
       System.out.println(description + ": " + uri);
    }
  }
}


C#: System.Net.HttpWebRequest

The .NET Common Language Runtime (CLR) defines HttpWebRequest for making HTTP requests, and NetworkCredential for authenticating the client to the server. You create an HttpWebRequest by passing a URI to WebRequest.Create. The NetworkCredential constructor takes a username and password (see Example 2-7).

Example 2-7. A client in C#

using System;
using System.IO;
using System.Net;
using System.Xml.XPath;

public class DeliciousApp {
    static string user = "username";
    static string password = "password";
    static Uri uri = new Uri("");

    static void Main(string[] args) {
        HttpWebRequest request = (HttpWebRequest) WebRequest.Create(uri);
        request.Credentials = new NetworkCredential(user, password);
        HttpWebResponse response = (HttpWebResponse) request.GetResponse();

        XPathDocument xml = new
            XPathDocument(response.GetResponseStream());
        XPathNavigator navigator = xml.CreateNavigator();
        foreach (XPathNavigator node in navigator.Select("/posts/post")) {
          string description = node.GetAttribute("description","");
          string href = node.GetAttribute("href","");
          Console.WriteLine(description + ": " + href);
        }
    }
}

PHP: libcurl

PHP comes with a binding to the C library libcurl, which can do pretty much anything you might want to do with a URI (see Example 2-8).

Example 2-8. A client in PHP

<?php
  $user = "username";
  $password = "password";

  $request = curl_init();
  curl_setopt($request, CURLOPT_URL, '');
  curl_setopt($request, CURLOPT_USERPWD, "$user:$password");
  curl_setopt($request, CURLOPT_RETURNTRANSFER, true);

  $response = curl_exec($request);
  $xml = simplexml_load_string($response);

  foreach ($xml->post as $post) {
    print "$post[description]: $post[href]\n";
  }
?>

JavaScript: XMLHttpRequest

If you’re writing a web service client in JavaScript, you probably intend it to run inside a web browser as part of an Ajax application. All modern web browsers implement an HTTP client library for JavaScript called XMLHttpRequest.

Because Ajax clients are developed differently from standalone clients, I’ve devoted an entire chapter to them: Chapter 11. The first example in that chapter is a client, so you can skip there right now without losing the flow of the examples.

The Command Line: curl

This example is a bit different: it doesn’t use a programming language at all. A program called curl is a capable HTTP client that runs from the Unix or Windows command line. It supports most HTTP methods, custom headers, several authentication mechanisms, proxies, compression, and many other features. You can use curl to do quick one-off HTTP requests, or use it in conjunction with shell scripts. Here’s curl in action, grabbing a user’s bookmarks:

$ curl
<?xml version='1.0' standalone='yes'?>
<posts tag="" user="username">

Other Languages

I don’t have the space or the expertise to cover every popular programming language in depth with a client example. I can, however, give brief pointers to HTTP client libraries for some of the many languages I haven’t covered yet.


ActionScript

Flash applications, like JavaScript applications, generally run inside a web browser. This means that when you write an ActionScript web service client you’ll probably use the Ajax architecture described in Chapter 11, rather than the standalone architecture shown in this chapter.

ActionScript’s XML class gives functionality similar to JavaScript’s XMLHttpRequest. The XML.load method fetches a URI and parses the response document into an XML data structure. ActionScript also provides a class called LoadVars, which works on form-encoded key-value pairs instead of on XML documents.


C

The libwww library for C was the very first HTTP client library, but most C programmers today use libcurl, the basis for the curl command-line tool. Earlier I mentioned PHP’s bindings to libcurl, but there are also bindings for more than 30 other languages. If you don’t like my recommendations, or I don’t mention your favorite programming language in this chapter, you might look at using the libcurl bindings.


C++

Use libcurl, either directly or through an object-oriented wrapper called cURLpp.

Common Lisp

simple-http is easy to use, but doesn’t support anything but basic HTTP GET and POST. The AllegroServe web server library includes a complete HTTP client library.


Perl

The standard HTTP library for Perl is libwww-perl (also known as LWP), available from CPAN or most Unix packaging systems. libwww-perl has a long history and is one of the best-regarded Perl libraries. To get HTTPS support, you should also install the Crypt::SSLeay module (available from CPAN).

[8] On Debian GNU/Linux and Debian-derived systems like Ubuntu, the package name is libopenssl-ruby. If your packaging system doesn’t include net/https, you’ll have to download it and install it by hand.

[9] Once you have the gem program installed, you can install rest-open-uri with the command gem install rest-open-uri. Hopefully my modifications to open-uri will one day make it into the core Ruby code, and the rest-open-uri gem will become redundant.
