High Performance Web Sites

By Steve Souders. Published by O'Reilly Media, Inc.

Chapter 4. Rule 2: Use a Content Delivery Network

The average user's bandwidth increases every year, but a user's proximity to your web server still has an impact on a page's response time. Web startups often have all their servers in one location. If they survive the startup phase and build a larger audience, these companies face the reality that a single server location is no longer sufficient—it's necessary to deploy content across multiple, geographically dispersed servers.

As a first step to implementing geographically dispersed content, don't attempt to redesign your web application to work in a distributed architecture. Depending on the application, a redesign could include daunting tasks such as synchronizing session state and replicating database transactions across server locations. Attempts to reduce the distance between users and your content could be delayed by, or never pass, this redesign step.

The correct first step is found by recalling the Performance Golden Rule, described in Chapter 1:

Only 10–20% of the end user response time is spent downloading the HTML document. The other 80–90% is spent downloading all the components in the page.

If the application web servers are closer to the user, the response time of one HTTP request is improved. On the other hand, if the component web servers are closer to the user, the response times of many HTTP requests are improved. Rather than starting with the difficult task of redesigning your application in order to disperse the application web servers, it's better to first disperse the component web servers. This not only achieves a bigger reduction in response times, it's also easier thanks to content delivery networks.
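As a rough illustration of this reasoning, the sketch below compares the two options for a hypothetical page. The 1,000 ms total, the 15/85 split between the HTML document and the components, and the 20% reduction from moving servers closer are all assumed numbers chosen for the arithmetic, not measurements from this chapter.

```python
def response_time_ms(html_ms, components_ms, html_cut_pct=0, components_cut_pct=0):
    """Total response time after cutting each part by a given percentage."""
    html = html_ms * (100 - html_cut_pct) / 100
    components = components_ms * (100 - components_cut_pct) / 100
    return html + components


# Hypothetical page: 15% of the time goes to the HTML document and
# 85% to the components, consistent with the Performance Golden Rule.
HTML_MS = 150
COMPONENTS_MS = 850

# Assumption: moving servers closer to the user cuts the affected
# download time by 20%. The real figure depends on the network.
PROXIMITY_CUT_PCT = 20

baseline = response_time_ms(HTML_MS, COMPONENTS_MS)
app_servers_dispersed = response_time_ms(
    HTML_MS, COMPONENTS_MS, html_cut_pct=PROXIMITY_CUT_PCT
)
component_servers_dispersed = response_time_ms(
    HTML_MS, COMPONENTS_MS, components_cut_pct=PROXIMITY_CUT_PCT
)

print(baseline, app_servers_dispersed, component_servers_dispersed)
```

Under these assumptions, dispersing the application web servers saves 30 ms, while dispersing the component web servers saves 170 ms.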

Content Delivery Networks

A content delivery network (CDN) is a collection of web servers distributed across multiple locations to deliver content to users more efficiently. This efficiency is typically discussed as a performance issue, but it can also result in cost savings. When optimizing for performance, the server selected for delivering content to a specific user is based on a measure of network proximity. For example, the CDN may choose the server with the fewest network hops or the server with the quickest response time.
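The two proximity measures mentioned above can be sketched as follows. This is only an illustration of the selection idea, not how any particular CDN works: the `EdgeServer` type, the server names, and the hop and timing figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EdgeServer:
    """A candidate server, with measurements taken for one user (hypothetical)."""
    name: str
    hops: int            # network hops between the user and the server
    response_ms: float   # measured response time for the user


def pick_by_fewest_hops(servers):
    """Choose the server with the fewest network hops."""
    return min(servers, key=lambda s: s.hops)


def pick_by_quickest_response(servers):
    """Choose the server with the quickest response time."""
    return min(servers, key=lambda s: s.response_ms)


# Made-up measurements for a single user.
candidates = [
    EdgeServer("east", hops=9, response_ms=48.0),
    EdgeServer("west", hops=14, response_ms=95.0),
    EdgeServer("central", hops=11, response_ms=41.0),
]

print(pick_by_fewest_hops(candidates).name)
print(pick_by_quickest_response(candidates).name)
```

The two measures do not always agree. With these made-up numbers, the server with the fewest hops is not the one with the quickest response.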

Some large Internet companies own their own CDN, but it's cost effective to use a CDN service provider. Akamai Technologies, Inc. is the industry leader. In 2005, Akamai acquired Speedera Networks, the primary low-cost alternative. Mirror Image Internet, Inc. is now the leading alternative to Akamai. Limelight Networks, Inc. is another competitor. Other providers, such as SAVVIS Inc., specialize in niche markets such as video content delivery.

Table 4-1 shows 10 top Internet sites in the U.S. and the CDN service providers they use.

You can see that:

  • Five use Akamai

  • One uses Mirror Image

  • One uses Limelight

  • One uses SAVVIS

  • Four either don't use a CDN or use a homegrown CDN solution

Smaller and noncommercial web sites might not be able to afford the cost of these CDN services. There are several free CDN services available. Globule is an Apache module developed at Vrije Universiteit in Amsterdam. CoDeeN was built at Princeton University on top of PlanetLab. CoralCDN is run out of New York University. They are deployed in different ways. Some require that end users configure their browsers to use a proxy. Others require developers to change the URL of their components to use a different hostname. Be wary of any that use HTTP redirects to point users to a local server, as this slows down web pages (see Chapter 13).
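For services that require changing component URLs to a different hostname, the change might look like the minimal sketch below. The hostname `cdn.example.com`, the extension list, and the `to_cdn_url` helper are hypothetical. A real service dictates its own hostname scheme.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical CDN hostname, used only for illustration.
CDN_HOST = "cdn.example.com"

# Assumed set of static component types: images, scripts,
# stylesheets, and Flash.
STATIC_EXTENSIONS = (".gif", ".jpg", ".png", ".js", ".css", ".swf")


def to_cdn_url(url, cdn_host=CDN_HOST):
    """Point a static component URL at the CDN hostname.

    URLs whose path does not end in a known static extension are
    returned unchanged, so dynamic pages stay on the origin server.
    """
    parts = urlsplit(url)
    if not parts.path.lower().endswith(STATIC_EXTENSIONS):
        return url
    return urlunsplit(parts._replace(netloc=cdn_host))


print(to_cdn_url("http://www.example.com/images/logo.gif"))
print(to_cdn_url("http://www.example.com/search?q=cdn"))
```

Only the hostname changes. The scheme, path, and query string are preserved, so no HTTP redirect is involved.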

In addition to improved response times, CDNs bring other benefits. Their services include backups, extended storage capacity, and caching. A CDN can also help absorb spikes in traffic, for example, during times of peak weather or financial news, or during popular sporting or entertainment events.

One drawback to relying on a CDN is that your response times can be affected by traffic from other web sites, possibly even those of your competitors. A CDN service provider typically shares its web servers across all its clients. Another drawback is the occasional inconvenience of not having direct control of the content servers. For example, modifying HTTP response headers must be done through the service provider rather than directly by your ops team. Finally, if your CDN service provider's performance degrades, so does yours. In Table 4-1, you can see that eBay and MySpace each use two CDN service providers, a smart move if you want to hedge your bets.

CDNs are used to deliver static content, such as images, scripts, stylesheets, and Flash. Serving dynamic HTML pages involves specialized hosting requirements: database connections, state management, authentication, hardware and OS optimizations, etc. These complexities are beyond what a CDN provides. Static files, on the other hand, are easy to host and have few dependencies. That is why a CDN is easily leveraged to improve the response times for a geographically dispersed user population.
