Benchmarking a proxy cache is a complicated endeavor. Problems arise at all layers of the networking model, from the physical to the application. Bottlenecks or inefficiencies can appear in many places. The following sections describe some common problems I have observed while performing benchmarks.
TCP includes a mechanism known as delayed ACKs [Clark, 1982; IETF, 1989]. The idea is to not immediately acknowledge every data packet. ACK-only packets are very small and thus not particularly efficient. If the TCP stack waits a little while, there is a chance that a data packet will soon be headed in the same direction, and the ACK can ride along with it. Piggybacking the ACK on the data is much more efficient. Most TCP implementations delay ACKs for up to 200 milliseconds. For some, the timeout is configurable.
Delayed ACKs are a big win for interactive flows (e.g., telnet) where small packets flow in both directions in bursts. HTTP, however, is largely unidirectional. HTTP requests normally fit inside a single TCP packet and, therefore, they are not affected by delayed ACKs. Responses, however, typically require many packets. At the beginning of the transfer, TCP slow start is also in effect, so the sender won’t transmit the second packet until the first one is acknowledged. The client, having already sent its request, has no outgoing data on which to piggyback that ACK, so the ACK waits until the delay timer expires. Thus, clients that use delayed ACKs increase the response time of each request by about 100–200 milliseconds.
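The arithmetic behind that penalty can be sketched with a toy model. This is only an illustration under stated assumptions: the sender starts with a congestion window of one segment, the path is symmetric, and the client holds the ACK for the full delay timer. The function name and parameters are hypothetical, not taken from any TCP implementation.

```python
def time_until_second_segment(rtt_ms: float, ack_delay_ms: float = 0.0) -> float:
    """Estimate how long the sender waits before it may send segment 2.

    Assumes an initial congestion window of one segment and a symmetric
    path, so each direction takes half of the round-trip time.
    """
    one_way_ms = rtt_ms / 2
    # Segment 1 travels to the client, the client holds the ACK for
    # ack_delay_ms, and then the ACK travels back to the sender.
    return one_way_ms + ack_delay_ms + one_way_ms


if __name__ == "__main__":
    rtt = 20.0  # hypothetical round-trip time in milliseconds
    immediate = time_until_second_segment(rtt)
    delayed = time_until_second_segment(rtt, ack_delay_ms=200.0)
    print(f"immediate ACK: {immediate:.0f} ms")
    print(f"delayed ACK:   {delayed:.0f} ms")
    print(f"penalty:       {delayed - immediate:.0f} ms")
```

In this model the penalty equals the ACK delay itself, regardless of the round-trip time, which is why the cost is most visible on fast local networks such as a benchmarking testbed.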
Most operating systems allow you to disable delayed ACKs. There is a tradeoff in doing so, however. ...
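As one illustration of disabling delayed ACKs, Linux exposes a per-socket option named TCP_QUICKACK. The sketch below assumes a Linux host; other operating systems use different, often system-wide, settings that are not shown here. The helper name `try_disable_delayed_ack` is hypothetical. Note that on Linux this option is not permanent: the kernel may return to delayed-ACK mode later, so applications that rely on it typically set it again after each read.

```python
import socket


def try_disable_delayed_ack(sock: socket.socket) -> bool:
    """Ask the kernel to send ACKs immediately on this socket.

    Returns True if the request was accepted, or False if the platform
    does not offer TCP_QUICKACK or rejects the option.
    """
    option = getattr(socket, "TCP_QUICKACK", None)  # Linux-specific constant
    if option is None:
        return False
    try:
        sock.setsockopt(socket.IPPROTO_TCP, option, 1)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        print("quick ACKs requested:", try_disable_delayed_ack(s))
```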