The Application Layer: HTTP

Recall that standard HTTP requests and proxy-HTTP requests are slightly different (see Section 2.1). The first line of a standard request normally includes only an absolute pathname. Proxy-HTTP requests, on the other hand, use the full URL. Interception proxying does not require browser configuration, so the browser thinks it is connected directly to an origin server and sends only the URL-path in the HTTP request line. The URL-path does not include the origin server hostname, so the cache must determine the origin server hostname by some other means.

The most reliable way to determine the origin server is from the HTTP/1.1 Host header. Fortunately, all recent browsers send the Host header, even if they use HTTP/1.0 in the request line. Thus, it is a relatively simple matter for the cache to transform this standard request:

GET /index.html HTTP/1.0
Host: www.ircache.net

into a proxy-HTTP request, such as:

GET http://www.ircache.net/index.html HTTP/1.0
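
The transformation above can be sketched in a few lines of Python. This is only an illustration, not how any particular cache implements it: the function name to_proxy_request_line is hypothetical, the request is assumed to arrive as a single string with CRLF line endings, and only plain http URLs are handled.

```python
def to_proxy_request_line(raw_request: str) -> str:
    """Rewrite a standard request line into proxy-HTTP form using the Host header.

    Hypothetical helper for illustration; assumes CRLF line endings and http only.
    """
    # Keep only the request line and headers, dropping any message body.
    head = raw_request.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")
    method, target, version = lines[0].split(" ", 2)

    # A request that already carries a full URL needs no rewriting.
    if target.startswith("http://"):
        return lines[0]

    # Header names are case-insensitive, so compare in lowercase.
    host = None
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "host":
            host = value.strip()
            break

    if host is None:
        # Without a Host header the origin server must be found some other way.
        raise ValueError("request has no Host header")

    return f"{method} http://{host}{target} {version}"


request = "GET /index.html HTTP/1.0\r\nHost: www.ircache.net\r\n\r\n"
print(to_proxy_request_line(request))
```

Raising an error when the Host header is missing keeps the sketch honest about its limits: in that case the cache has to fall back on another source of the origin server address.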

In the absence of the Host header, the cache might be able to use the socket interface to get the IP address for which the packet was originally destined. The Unix sockets interface allows an application to retrieve the local address of a connected socket with the getsockname() function. In this case, the local address is the address of the origin server that the proxy pretends to be. Whether this actually works depends on how the operating system implements the packet redirection. The native ...
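
As a rough sketch of the getsockname() fallback, Python's socket module exposes the same call as a method on socket objects. The helper names original_destination and guess_url are hypothetical, the example assumes IPv4, and on an ordinary (non-intercepted) connection the result is simply the address the proxy itself is listening on. Whether an intercepted connection reports the origin server's address depends on the operating system's redirection mechanism, as noted above.

```python
import socket


def original_destination(conn: socket.socket):
    """Return the (address, port) that the accepted connection is bound to locally.

    Hypothetical helper; with suitable packet redirection this is the address
    the client believed it was connecting to.
    """
    host, port = conn.getsockname()[:2]
    return host, port


def guess_url(conn: socket.socket, url_path: str) -> str:
    """Build a full URL from the socket's local address and the request's URL-path.

    Assumes IPv4 and plain http; the default port 80 is omitted from the URL.
    """
    host, port = original_destination(conn)
    authority = host if port == 80 else f"{host}:{port}"
    return f"http://{authority}{url_path}"
```

Because this yields an IP address rather than a hostname, it cannot distinguish between several virtual hosts that share one address, which is one reason the Host header remains the more reliable source.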
