Benchmarking Tools

A number of tools are available for benchmarking proxy caches. Some are self-contained: they generate all requests and responses internally. Others rely on trace log files for requests and on live origin servers for responses. Each technique has advantages and disadvantages.

Using trace files is attractive because the client and server programs are simpler to implement. Instead of managing complex workload parameters, the client just reads a file of URLs and sends HTTP requests. In essence, the workload parameters are embedded in the log files. A self-contained benchmark is more complicated because it uses mathematical formulas to generate new requests and responses. For example, a particular request has some probability of being a cache hit, of being cachable, and of being a certain size. One problem with trace files, however, is that they don’t normally record all the information needed to correctly play back the requests. For example, a log file doesn’t normally say if a particular request was on a persistent connection. It’s also unlikely to indicate certain request headers, such as Cache-Control and If-Modified-Since.
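
To make the idea of a self-contained workload concrete, here is a minimal Python sketch of a generator that decides, for each request, whether it should be a hit, whether the object is cachable, and how large the response is. It is not taken from any particular benchmarking tool. The class and parameter names (WorkloadGenerator, hit_ratio, cachable_ratio, mean_size), the origin.example URLs, and the choice of an exponential size distribution are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class SyntheticRequest:
    url: str
    cachable: bool
    is_hit: bool   # True if the URL repeats an earlier cachable request
    size: int      # response body size in bytes


class WorkloadGenerator:
    """Hypothetical generator driven by a few workload parameters."""

    def __init__(self, hit_ratio=0.5, cachable_ratio=0.8,
                 mean_size=10_000, seed=None):
        self.hit_ratio = hit_ratio            # chance of repeating a known URL
        self.cachable_ratio = cachable_ratio  # chance a *new* object is cachable
        self.mean_size = mean_size            # mean response size in bytes
        self.rng = random.Random(seed)
        self.sizes = {}    # cachable URL -> size, so repeats stay consistent
        self.known = []    # cachable URLs requested so far
        self.next_id = 0

    def next_request(self):
        # A hit is only possible once some cachable object has been requested.
        if self.known and self.rng.random() < self.hit_ratio:
            url = self.rng.choice(self.known)
            return SyntheticRequest(url, True, True, self.sizes[url])

        # Otherwise invent a new URL (assumed naming scheme) and its properties.
        self.next_id += 1
        url = f"http://origin.example/obj/{self.next_id}"
        cachable = self.rng.random() < self.cachable_ratio
        # Assumption: sizes are exponentially distributed around mean_size.
        size = max(1, int(self.rng.expovariate(1 / self.mean_size)))
        if cachable:
            self.known.append(url)
            self.sizes[url] = size
        return SyntheticRequest(url, cachable, False, size)


if __name__ == "__main__":
    gen = WorkloadGenerator(seed=42)
    for _ in range(5):
        print(gen.next_request())
```

A trace-driven client needs none of this state: it would simply read one URL per log line and send the request, which is why it is simpler to implement. The cost is that properties such as the hit ratio are then fixed by whatever the trace happens to contain.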

Trace log files are usually taken from production proxy caches. This is good because the trace represents real web traffic on your network, generated by real users. If you want to run a trace-based benchmark or simulation but don’t have any log files, you might be out of luck. Log files are not usually shared between organizations ...
