CDN to the Rescue

The solution to our bandwidth issues was to use a content delivery network, or CDN. A CDN hosts files in multiple places on the Internet for quicker retrieval by client computers. The files are often loaded into a CDN's network on demand, so in some ways it works like a caching reverse proxy. For example, a request for an image comes in to the CDN from a client. If the object exists at the CDN, it is returned to the client. If not, the CDN asks the origin server (in this case, our servers) for the data and then keeps a copy for as long as the caching headers our servers provide allow. The majority of the bandwidth used during the Yahoo! events consisted of images, JavaScript, and CSS. These types of data are great for offloading to a CDN.
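
The request flow described above can be sketched in a few lines of Python. This is only an illustrative model, not how any particular CDN is implemented: the `EdgeCache` class, the `fetch_from_origin` callback, and the idea of the origin returning a `(body, max_age_seconds)` pair are all assumptions made for the example, standing in for a real HTTP fetch and real caching headers.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class CachedObject:
    body: bytes
    expires_at: float  # time (in seconds) after which the copy is stale


class EdgeCache:
    """Toy model of one CDN node acting like a caching reverse proxy."""

    def __init__(
        self,
        fetch_from_origin: Callable[[str], Tuple[bytes, int]],
        clock: Callable[[], float] = time.time,
    ) -> None:
        # fetch_from_origin is a hypothetical stand-in for an HTTP request
        # to the origin; it returns (body, max_age_seconds).
        self._fetch = fetch_from_origin
        self._clock = clock
        self._store: Dict[str, CachedObject] = {}
        self.origin_requests = 0

    def get(self, path: str) -> bytes:
        now = self._clock()
        cached = self._store.get(path)
        # Hit: the object exists at the CDN and has not expired.
        if cached is not None and cached.expires_at > now:
            return cached.body
        # Miss or expired: ask the origin server for the data.
        body, max_age = self._fetch(path)
        self.origin_requests += 1
        # Keep a copy only for as long as the origin allows.
        if max_age > 0:
            self._store[path] = CachedObject(body, now + max_age)
        return body
```

In this model, repeated requests for the same image reach the origin only once per expiry period, which is where the bandwidth savings come from.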

A CDN has some drawbacks. For instance, you lose a bit of control over your data. You can't simply change a file and upload it with the same name, because the CDN's servers hold old copies and will not ask for a new copy of the object until the cached one expires. You also have to change the way you think about your static content. CDNs don't work well with very short cache timeouts: anything less than 30 minutes can lead to very erratic request patterns.
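
One common way to live with the same-name restriction, offered here as a general technique rather than something this chapter prescribes, is to put a version marker in the filename so that changed content always gets a new name. The sketch below derives that marker from a hash of the file's contents; the `versioned_name` helper and the 8-character digest length are hypothetical choices for illustration.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_name(path: str, content: bytes, digest_len: int = 8) -> str:
    """Return path with a short content hash inserted before the extension.

    Hypothetical helper: changed content yields a new name, so the CDN
    treats it as a new object instead of serving the old cached copy.
    """
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))
```

Because a new name is a new object as far as the CDN is concerned, files published this way can carry long expiry times, and pages simply reference the new name when the content changes.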
