HTTP Headers

It is interesting to examine the HTTP headers of requests and responses flowing through the caches. To get this information, I temporarily modified Squid to write a short binary record that indicates which headers are present. I also tracked the Cache-control directives.

The headers log file does not include URLs, so I cannot eliminate the popularity effects. The log contains one entry for each request from a client and one for each response to a client, so this data is from the client’s point of view.
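A short binary record of header presence can be pictured as a bitmask with one bit per known header. The following is only a minimal sketch of that idea, not the actual Squid modification: the header list, the bit positions, the 64-bit little-endian record size, and the function names `encode_record` and `count_occurrences` are all assumptions made for illustration.

```python
import struct

# Hypothetical header list. The bit positions and the 8-byte record size
# are assumptions for illustration, not the layout Squid actually wrote.
KNOWN_HEADERS = ["host", "user-agent", "via", "accept",
                 "cache-control", "x-forwarded-for"]
BIT = {name: i for i, name in enumerate(KNOWN_HEADERS)}


def encode_record(header_names):
    """Pack the set of headers present in one message into 8 bytes."""
    mask = 0
    for name in header_names:
        bit = BIT.get(name.lower())  # header names are case-insensitive
        if bit is not None:          # unknown headers are simply not recorded
            mask |= 1 << bit
    return struct.pack("<Q", mask)


def count_occurrences(log_bytes):
    """Return (number of records, per-header counts) for a whole log."""
    counts = dict.fromkeys(KNOWN_HEADERS, 0)
    total = 0
    for (mask,) in struct.iter_unpack("<Q", log_bytes):
        total += 1
        for name, bit in BIT.items():
            if mask & (1 << bit):
                counts[name] += 1
    return total, counts
```

Dividing each count by the number of records and multiplying by 100 gives a percent-occurrence figure of the kind reported in the tables. Because a record like this carries no URL, repeated requests for the same object cannot be collapsed.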

Client Request Headers

Table A-2 lists the request headers and their frequency of occurrence. It’s important to keep in mind that most of these requests come from child caches, not from web browsers. Furthermore, most of the child caches are also running Squid. The high occurrence of Via and X-Forwarded-For headers is evidence of this. Both headers are added by proxies, and the latter is an extension header used by Squid. According to this data, around 99% of all requests come from child caches.
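The inference that a request arrived through another proxy can be expressed as a simple check for either of these two headers. This is a hedged sketch only: the function names `came_through_proxy` and `proxy_share` are hypothetical, and the check is a heuristic, since a proxy can be configured to omit both headers.

```python
def came_through_proxy(header_names):
    """Heuristic: treat a request as proxied if it carries Via or X-Forwarded-For."""
    names = {name.lower() for name in header_names}  # header names are case-insensitive
    return "via" in names or "x-forwarded-for" in names


def proxy_share(requests):
    """Percentage of requests (each a list of header names) that look proxied."""
    if not requests:
        return 0.0
    proxied = sum(1 for headers in requests if came_through_proxy(headers))
    return 100.0 * proxied / len(requests)
```

For example, `proxy_share([["Host", "Via"], ["Host", "User-Agent"]])` evaluates to 50.0, because only the first request carries a Via header.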

Table A-2. Client Request Headers (IRCache Data)

Header  % Occurrence  Header  % Occurrence
Host 99.91 Range 0.46
User-Agent 99.21 Connection 0.26
Via 98.90 From 0.24
Accept 98.84 Date 0.18
Cache-Control 98.34 Proxy-Authorization 0.07
X-Forwarded-For 98.19 Request-Range 0.06
Accept-Language 91.33 If-Range 0.05
Referer 85.00 Expires 0.02
Accept-Encoding 82.60 Mime-Version 0.01
Proxy-Connection 78.46 Content-Encoding 0.00
Cookie 39.18 Location 0.00
Accept-Charset 28.77 If-Match 0.00
If-Modified-Since ...
