Advertising

We all know that electronic commerce has been a driving force behind the evolution of the Web. Advertising, in particular, is important because it generates revenue for a large number of web sites. Advertising fees are often based on the number of views or impressions. That is, the advertiser pays the web site some amount for every person who sees their ad. But how do the site owners and the advertisers know how many people have seen a particular ad?

The simplest approach is to count the number of accesses logged by the site’s HTTP server. As I’m sure you can guess, with caching in place, some requests for an advertisement are satisfied by a cache and never reach the origin server. Thus, the web site counts too few accesses and perhaps undercharges the advertiser. The advertiser might not mind being undercharged, but it is probably in everyone’s best interest to have accurate access counts. Later, in Section 6.4.2, I suggest some techniques that content providers can use to increase ad counting accuracy while remaining cache-friendly.
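As a rough illustration of log-based counting, here is a minimal Python sketch. It assumes the server writes Common Log Format entries. The function name count_ad_requests, the ad path /ads/banner1.gif, and the sample log lines are all hypothetical. Counting both 200 and 304 responses as impressions is also an assumption, not a rule from any standard.

```python
import re
from collections import Counter

# Common Log Format: host ident user [time] "METHOD path PROTOCOL" status bytes
CLF_PATTERN = re.compile(
    r'^(\S+) \S+ \S+ \[[^\]]+\] "(\S+) (\S+)[^"]*" (\d{3}) \S+'
)


def count_ad_requests(log_lines, ad_paths):
    """Count logged GET requests for each ad URL path (hypothetical helper)."""
    counts = Counter()
    for line in log_lines:
        match = CLF_PATTERN.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        _host, method, path, status = match.groups()
        # Assumption: 200 (full response) and 304 (a validated cached copy)
        # both count as one impression.
        if method == "GET" and path in ad_paths and status in ("200", "304"):
            counts[path] += 1
    return counts


# Made-up sample entries, for illustration only.
sample_log = [
    '10.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /ads/banner1.gif HTTP/1.0" 200 2326',
    '10.0.0.2 - - [10/Oct/2000:13:56:01 -0700] "GET /index.html HTTP/1.0" 200 5120',
    '10.0.0.3 - - [10/Oct/2000:13:57:12 -0700] "GET /ads/banner1.gif HTTP/1.0" 304 -',
]

ad_counts = count_ad_requests(sample_log, {"/ads/banner1.gif"})
print(ad_counts["/ads/banner1.gif"])
```

A count produced this way is only a lower bound. Any request answered by a cache without contacting the origin server leaves no entry in the log.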

Some people take issue with the notion of counting ad impressions and other page accesses. The fact that something requests a page or image does not mean a human being actually views it. Search engines and other web robots can generate a large number of requests. The User-Agent request header normally identifies the entity that issued the request. Thus, user requests can usually be differentiated from robot requests. Another tricky aspect of request counting is related ...
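To make the User-Agent idea concrete, here is a small Python sketch that separates probable robot requests from user requests. The marker list, the function names, and the sample header values are hypothetical. A real deployment would need a maintained list, and the header can be missing or misleading, so treat the result as an estimate.

```python
# Hypothetical, deliberately incomplete list of substrings that often
# appear in the User-Agent values sent by web robots.
ROBOT_MARKERS = ("bot", "crawler", "spider")


def is_probable_robot(user_agent):
    """Guess whether a User-Agent value belongs to a robot (heuristic only)."""
    ua = user_agent.lower()
    return any(marker in ua for marker in ROBOT_MARKERS)


def split_counts(user_agents):
    """Tally requests into 'user' and 'robot' buckets by User-Agent."""
    counts = {"user": 0, "robot": 0}
    for ua in user_agents:
        counts["robot" if is_probable_robot(ua) else "user"] += 1
    return counts


# Made-up header values, for illustration only.
sample_agents = [
    "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98)",
    "ExampleCrawler/1.0",
    "Mozilla/4.7 [en] (X11; U; Linux 2.2.14 i686)",
]

impression_split = split_counts(sample_agents)
print(impression_split)
```

Only the requests in the user bucket would then be candidates for billable impressions.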
