The short answer is that caching saves money. It saves time as well, which is sometimes the same thing if you believe that “time is money.” But how does caching save you money?
It does so by providing a more efficient mechanism for distributing information on the Web. Consider an example from our physical world: the distribution of books. Specifically, think about how a book gets from publisher to consumer. Publishers print the books and sell them, in large quantities, to wholesale distributors. The distributors, in turn, sell the books in smaller quantities to bookstores. Consumers visit the stores and purchase individual books. On the Internet, web caches are analogous to the bookstores and wholesale distributors.
The analogy is not perfect, of course. Books cost money; web pages (usually) don’t. Books are physical objects, whereas web pages are just electronic and magnetic signals. It’s difficult to copy a book, but trivial to copy electronic data.
The point is that both caches and bookstores enable efficient distribution of their respective contents. An Internet without caches is like a world without bookstores. Imagine 100,000 residents of Los Angeles each buying one copy of Harry Potter and the Sorcerer’s Stone from the publisher in New York. Now imagine 50,000 Internet users in Australia each downloading the Yahoo! home page every time they access it. It’s much more efficient to transfer the page once, cache it, and then serve future requests directly from the cache.
In order for caching to be effective, the following conditions must be met:
Client requests must exhibit locality of reference.
The cost of caching must be less than the cost of direct retrieval.
We can intuitively conclude that the first requirement is true. Certain web sites are very popular. Classic examples are the starting pages for Netscape and Microsoft browsers. Others include searching and indexing sites such as Yahoo! and AltaVista. Event-based sites, such as those for the Olympics, NASA’s Mars Pathfinder mission, and World Cup Soccer, become extremely popular for days or weeks at a time. Finally, every individual has a few favorite pages that he or she visits on a regular basis.
It’s not always obvious that the second requirement is true. We need to compare the costs of caching to the costs of not caching. Numerous factors enter into the analysis, some of which are easier to measure than others. To calculate the cost of caching, we can add up the costs for hardware, software, and staff time to administer the system. We also need to consider the time users save waiting for pages to load (latency) and the cost of Internet bandwidth.
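The comparison above can be sketched as a simple break-even calculation. All of the figures below (hit ratio, traffic volume, bandwidth price, and amortized system cost) are hypothetical, chosen only to illustrate the arithmetic; real values vary widely:

```python
def monthly_caching_benefit(hit_ratio, monthly_gb, cost_per_gb, cache_cost_per_month):
    # Bandwidth saved is roughly the byte hit ratio times total HTTP traffic.
    # A positive result means caching costs less than direct retrieval.
    saved = hit_ratio * monthly_gb * cost_per_gb
    return saved - cache_cost_per_month

# Hypothetical figures: 25% byte hit ratio, 500 GB/month of HTTP traffic,
# $50/GB metered bandwidth, $2000/month for hardware, software, and staff.
print(monthly_caching_benefit(0.25, 500, 50, 2000))   # 4250.0: caching pays off here
```

Note that this model captures only the easily measured factors; the value of reduced user-perceived latency is real but much harder to assign a dollar figure.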
Let’s take a closer look at the three primary benefits of caching web content:
To make web pages load faster (reduce latency)
To reduce wide area bandwidth usage
To reduce the load placed on origin servers
Latency refers to delays in the transmission of data from one point to another. The transmission of data over electrical or optical circuits is limited by the speed of light. In fact, electrical and optical pulses travel at approximately two-thirds the speed of light in wires and fibers. Theoretically, it takes at least 25 milliseconds to send a packet across the U.S. In practice, it takes a little longer, say about 30 milliseconds. Transoceanic delays are in the 100-millisecond range.
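The arithmetic behind those figures is straightforward. The distances below are rough assumptions (fiber routes are longer than straight-line distances), but the calculation shows where the 25- and 100-millisecond numbers come from:

```python
SPEED_OF_LIGHT = 3.0e8                    # meters/second in a vacuum
SIGNAL_SPEED = (2.0 / 3.0) * SPEED_OF_LIGHT  # ~2e8 m/s in copper or fiber

def propagation_delay_ms(distance_km):
    """One-way propagation delay for a signal traveling distance_km."""
    return distance_km * 1000.0 / SIGNAL_SPEED * 1000.0

# Assumed route lengths, not great-circle distances:
print(propagation_delay_ms(5000))    # 25.0 ms: roughly a U.S. coast-to-coast fiber path
print(propagation_delay_ms(20000))   # 100.0 ms: a long transoceanic route
```

This is only the propagation component; queuing, serialization, and processing delays come on top of it, which is why real round trips take longer.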
Another source of latency is network congestion. When network links are close to full utilization, packets experience queuing delays inside routers and switches. Queuing, which can occur at any number of points along a path, is occasionally a source of significant delay. When a device’s queue is full, it is forced to discard incoming (or outgoing) packets. With reliable protocols, such as TCP, lost packets are eventually retransmitted. TCP’s retransmission algorithms are much too complex to describe here. However, a relatively small amount of packet loss can result in a dramatic decrease in throughput.
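The effect of packet loss on throughput can be estimated with the well-known approximation attributed to Mathis et al., which says TCP throughput is roughly proportional to MSS/RTT divided by the square root of the loss rate. The segment size, round-trip time, and loss rates below are illustrative assumptions:

```python
import math

def tcp_throughput_bps(mss_bytes, rtt_seconds, loss_rate):
    # Mathis et al. approximation: rate ~ (MSS/RTT) * (C / sqrt(p)), C ~ 1.22.
    # A steady-state estimate only; it ignores slow start and timeouts.
    C = 1.22
    return (mss_bytes * 8.0 / rtt_seconds) * (C / math.sqrt(loss_rate))

# Assumed 1460-byte segments and a 30 ms round-trip time:
print(tcp_throughput_bps(1460, 0.030, 0.0001) / 1e6)   # ~47.5 Mbit/s at 0.01% loss
print(tcp_throughput_bps(1460, 0.030, 0.01) / 1e6)     # ~4.75 Mbit/s at 1% loss
```

Because throughput falls with the square root of the loss rate, a hundredfold increase in loss cuts throughput tenfold, which is why "a relatively small amount of packet loss" hurts so much.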
A web cache located close to its users reduces latency for cache hits. Transmission delays are much lower because the systems are close to each other. Additionally, queuing and retransmission delays are less likely because fewer routers and links are involved. If the service is designed properly, a cache miss should not be delayed much longer than a direct transfer between the client and origin server. Thus, cache hits reduce the average latency of all requests.
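The claim that hits reduce average latency is just a weighted mean of the hit and miss times. The 20 ms and 300 ms figures below are hypothetical, but the shape of the result holds for any values where hits are faster than misses:

```python
def mean_latency_ms(hit_ratio, hit_ms, miss_ms):
    # Weighted average over hits and misses; assumes a miss costs about
    # the same as a direct client-to-origin transfer.
    return hit_ratio * hit_ms + (1.0 - hit_ratio) * miss_ms

# Hypothetical figures: 20 ms for a cache hit, 300 ms for a miss.
print(mean_latency_ms(0.0, 20, 300))   # 300.0 ms with no cache
print(mean_latency_ms(0.4, 20, 300))   # 188.0 ms at a 40% hit ratio
```

If a poorly designed service made misses significantly slower than direct transfers, the miss penalty could eat into (or erase) the benefit, which is why the miss path matters as much as the hit ratio.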
To some extent, latency due to congestion can be eliminated by upgrading network links and/or hardware (faster switches, fatter pipes). In reality, you may find this option too costly. Very fast routers and high-speed, wide-area data circuits can be prohibitively expensive. Furthermore, you cannot upgrade equipment you do not control. For example, there is almost nothing a person living in Japan can do to increase the capacity of major exchange points in the United States. Installing a cache, however, avoids congested parts of the network for some web requests.
Content providers and online shopping companies have an incentive to develop cache-friendly web sites. A 1999 study by Zona Research concluded that e-commerce sites may have been losing up to US$362 million per month due to page loading delays and network failures [Zona Research, 1999]. In Chapter 6, we’ll talk about ways that content providers can make their servers cache-friendly.
Another reason to utilize web caches is bandwidth reduction. Every request that results in a cache hit saves bandwidth. If your Internet connection is congested, installing a cache is likely to improve performance for other applications (e.g., email, interactive games, streaming audio), because all your network applications compete for bandwidth. A web cache reduces the amount of bandwidth consumed by HTTP traffic, leaving a larger share for the others. It is also correct to say that a web cache increases your effective bandwidth. If your network supports 100 users without a cache, you can probably support 150 users with a cache.
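The 100-versus-150-user claim follows from a simple scaling argument: if a fraction of the bytes is served from the cache, upstream demand per user shrinks by that fraction. A one-third byte hit ratio, assumed below, is the value that makes the numbers in the text work out:

```python
def effective_capacity(link_users, byte_hit_ratio):
    # Upstream traffic per user scales by (1 - byte_hit_ratio), so the
    # same wide-area link supports proportionally more users.
    return link_users / (1.0 - byte_hit_ratio)

# Assuming one-third of all bytes are served as cache hits:
print(round(effective_capacity(100, 1.0 / 3.0)))   # 150 users on the same link
```

Note that the byte hit ratio (fraction of bytes served from cache) is the right measure here, not the request hit ratio, since bandwidth is consumed per byte rather than per request.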
Even if your Internet connection has plenty of spare bandwidth, you may still want to use a cache. In the United States, Internet service is typically billed at flat rates. Many other countries, however, have usage-based billing. That is, you pay only for the bandwidth you use. Australia and New Zealand were among the first countries to meter bandwidth usage. Not surprisingly, these countries were also the first to widely deploy web caches.
In the same way that caching reduces bandwidth, it also reduces the load imposed upon origin servers. A server’s response time usually increases as the request rate increases. An idle server is faster than a busy one. A very busy server is slow, regardless of the network conditions between the server and its clients. Banga and Druschel show specifically how web server performance degrades when overloaded [Banga and Druschel, 1997].
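The relationship between load and response time can be illustrated with an elementary M/M/1 queuing model; this is a textbook simplification, not the analysis from Banga and Druschel, and the service rate below is an assumption:

```python
def mm1_response_time_ms(service_rate, arrival_rate):
    # M/M/1 mean response time: T = 1 / (mu - lambda), in seconds,
    # converted to milliseconds. Valid only while arrival_rate < service_rate.
    assert arrival_rate < service_rate, "server is saturated"
    return 1000.0 / (service_rate - arrival_rate)

# Hypothetical server that can handle up to 100 requests/second:
print(mm1_response_time_ms(100, 10))   # ~11.1 ms when lightly loaded
print(mm1_response_time_ms(100, 90))   # 100.0 ms when nearly saturated
```

The nonlinearity is the point: response time grows slowly at first, then explodes as the request rate approaches the server's capacity, so every request absorbed by a cache helps most exactly when the origin server is busiest.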
It may seem strange that an ISP would install a caching proxy to reduce the load on content providers. In fact, some content providers don’t want caching proxies to reduce their request rates. As we’ll see in Chapter 3, they would rather receive all requests.
However, more and more content providers are using surrogates (commonly called server accelerators or reverse caches) to more efficiently distribute their data. Surrogates are used extensively by content distribution network (CDN) providers such as Akamai and Digital Island. A surrogate is much like a caching proxy, except that it works on behalf of an origin server rather than a user agent. This means that surrogates interpret some HTTP headers differently.
CDNs (and surrogates) are a good way for content providers to make their sites faster for end users. However, it’s not the only way. By carefully configuring their web servers, content providers can take full advantage of all the caching proxies on the Internet. As we’ll see in Chapter 6, it’s even possible to get accurate access counts while making most of the content cachable.