Web Performance Daybook Volume 2

By Stoyan Stefanov. Published by O'Reilly Media, Inc.
  1. Web Performance Daybook, Volume 2
  2. SPECIAL OFFER: Upgrade this ebook with O’Reilly
  3. Foreword
  4. From the Editor
  5. About the Authors
    1. Patrick Meenan
    2. Nicholas Zakas
    3. Guy Podjarny
    4. Stoyan Stefanov
    5. Tim Kadlec
    6. Brian Pane
    7. Josh Fraser
    8. Steve Souders
    9. Betty Tso
    10. Israel Nir
    11. Marcel Duran
    12. Éric Daspet
    13. Alois Reitbauer
    14. Matthew Prince
    15. Buddy Brewer
    16. Alexander Podelko
    17. Estelle Weyl
    18. Aaron Peters
    19. Tony Gentilcore
    20. Matthew Steele
    21. Bryan McQuade
    22. Tobie Langel
    23. Billy Hoffman
    24. Joshua Bixby
    25. Sergey Chernyshev
    26. JP Castro
    27. Pavel Paulau
    28. David Calhoun
    29. Nicole Sullivan
    30. James Pearce
    31. Tom Hughes-Croucher
    32. Dave Artz
  6. Preface
    1. Conventions Used in This Book
    2. Using Code Examples
    3. Safari® Books Online
    4. How to Contact Us
  7. 1. WebPagetest Internals
    1. Function Interception
    2. Code Injection
    3. Resulting Browser Architecture
    4. Get the Code
    5. Browser Advancements
  8. 2. localStorage Read Performance
    1. The Benchmark
    2. What’s Going On?
    3. Optimization Strategy
    4. Follow Up
  9. 3. Why Inlining Everything Is NOT the Answer
    1. No Browser Caching
    2. No Edge Caching
    3. No Loading On-Demand
    4. Invalidates Browser Look-Ahead
    5. Flawed Solution: Inline Everything only on First Visit
    6. Summary and Recommendations
  10. 4. The Art and Craft of the Async Snippet
    1. The Facebook Plug-ins JS SDK
    2. Design Goals
    3. The Snippet
    4. Appending Alternatives
    5. Whew!
    6. What’s Missing?
    7. First Parties
    8. Parting Words: On the Shoulders of Giants
  11. 5. Carrier Networks: Down the Rabbit Hole
    1. Variability
    2. Latency
    3. Transcoding
    4. Gold in Them There Hills
    5. 4G Won’t Save Us
    6. Where Do We Go from Here?
    7. Light at the End of the Tunnel
  12. 6. The Need for Parallelism in HTTP
    1. Introduction: Falling Down the Stairs
    2. Current Best Practices: Working around HTTP
    3. Experiment: Mining the HTTP Archive
    4. Results: Serialization Abounds
    5. Recommendations: Time to Fix the Protocols
  13. 7. Automating Website Performance
  14. 8. Frontend SPOF in Beijing
    1. Business Insider
    2. CNET
    3. O’Reilly Radar
    4. The Cause of Frontend SPOF
    5. Avoiding Frontend SPOF
    6. Call to Action
  15. 9. All about YSlow
  16. 10. Secrets of High Performance Native Mobile Applications
    1. Keep an Eye on Your Waterfalls
    2. Compress Those Resources
    3. Don’t Download the Same Content Twice
    4. Can Too Much Adriana Lima Slow You Down?
    5. Epilogue
  17. 11. Pure CSS3 Images? Hmm, Maybe Later
    1. The Challenge
    2. Getting My Hands Dirty with CSS3 Cooking
    3. Cross-Browser Results
    4. Benchmarking
      1. Payload
      2. Rendering
    5. Are We There Yet?
    6. Appendix: Code Listings
      1. HTML
      2. CSS
  18. 12. Useless Downloads of Background Images in Android
    1. The Android Problem
    2. And the Lack of Solution
  19. 13. Timing the Web
    1. Conclusion
  20. 14. I See HTTP
    1. icy
    2. Some details
    3. Walkthrough
    4. Todos
    5. The Road Ahead
    6. All I Want for Christmas…
  21. 15. Using Intelligent Caching to Avoid the Bot Performance Tax
  22. 16. A Practical Guide to the Navigation Timing API
    1. Why You Should Care
    2. Collecting Navigation Timing Timestamps and Turning Them into Useful Measurements
    3. Using Google Analytics as a Performance Data Warehouse
    4. Reporting on Performance in Google Analytics
    5. Limitations
    6. Final Thoughts
  23. 17. How Response Times Impact Business
  24. 18. Mobile UI Performance Considerations
    1. Battery Life
    2. Latency
    3. Embedding CSS and JS: A Best Practice?
    4. Memory
      1. Optimize Images
      2. Weigh the Benefits of CSS
      3. GPU Benefits and Pitfalls
      4. Viewport: Out of Sight Does Not Mean Out of Mind
      5. Minimize the DOM
    5. UI Responsiveness
    6. Summary
  25. 19. Stop Wasting Your Time Using the Google Analytics Site Speed Report
    1. Problem: A Bug in Firefox Implementation of the Navigation Timing API
    2. Solution: Filter Out the Firefox Timings in Google Analytics
    3. Good News: The Bug Was Fixed in Firefox 9
    4. Closing Remark
  26. 20. Beyond Web Developer Tools: Strace
    1. What About Other Platforms?
    2. Getting Started
    3. Zeroing In
    4. Example: Local Storage
    5. We’ve Only Scratched the Surface
  27. 21. Introducing mod_spdy: A SPDY Module for the Apache HTTP Server
    1. Getting Started with mod_spdy
    2. SPDY and Apache
    3. Help to Improve mod_spdy
  28. 22. Lazy Evaluation of CommonJS Modules
    1. Close Encounters of the Text/JavaScript Type
    2. Lazy Loading
    3. Lazy Evaluation to the Rescue
    4. Building Lazy Evaluation into CommonJS Modules
  29. 23. Advice on Trusting Advice
  30. 24. Why You’re Probably Reading Your Performance Measurement Results Wrong (At Least You’re in Good Company)
    1. The Methodology
    2. The Results
    3. Conclusions
    4. Why Does This Matter?
    5. Takeaways
  31. 25. Lossy Image Compression
    1. Lossy Compression
  32. 26. Performance Testing with Selenium and JavaScript
    1. Recording Data
    2. Collecting and Analyzing the Data
    3. Sample Results
    4. Benefits
    5. Closing Words
    6. Credits
  33. 27. A Simple Way to Measure Website Performance
    1. Concept
    2. Advantages
    3. Limitation
    4. Conclusion
  34. 28. Beyond Bandwidth: UI Performance
    1. Introduction
    2. After the Page Loads: The UI Layer
    3. UI Profilers
      1. CSS Stress Test
      2. CSS Profilers
      3. CSS Lint
      4. DOM Monster
    4. Perception of Speed
    5. Tidbits
    6. Call for a Focus on UI Performance
  35. 29. CSS Selector Performance Has Changed! (For the Better)
    1. Style Sharing
    2. Rule Hashes
    3. Ancestor Filters
    4. Fast Path
    5. What Is It Still Slow?
  36. 30. Losing Your Head with PhantomJS and confess.js
    1. Performance Summaries
    2. App Cache Manifest
    3. Onward and Upward
  37. 31. Measure Twice, Cut Once
    1. Identifying Pages/Sections
    2. Identifying Features
    3. Optimizing
  38. 32. When Good Backends Go Bad
    1. What Is a Good Backend Time?
    2. Figuring Out What Is Going On
    3. Fixing It
    4. Finally
  39. 33. Web Font Performance: Weighing @font-face Options and Alternatives
    1. Font Hosting Services Versus Rolling Your Own
    2. What the FOUT?
    3. Removing Excess Font Glyphs
    4. JavaScript Font Loaders
      1. Introducing Boot.getFont: A Fast and Tiny Web Font Loader
    5. Gentlefonts, Start Your Engines!
      1. My Observations
    6. Final Thoughts
  40. About the Author
  41. Colophon
  42. SPECIAL OFFER: Upgrade this ebook with O’Reilly
  43. Copyright
Chapter 1. WebPagetest Internals

Patrick Meenan

I thought I’d take the opportunity this year to give a little bit of visibility into how WebPagetest gathers performance data from browsers. Other tools on Windows use similar techniques, but the details here are specific to WebPagetest and may not reflect how other tools work.

First off, it helps to understand the networking stack on Windows from a browser’s perspective (Figure 1-1).

Figure 1-1. Windows networking stack from browser’s perspective

It doesn’t matter what the browser is: if it runs on Windows, the architecture pretty much has to look like the diagram above, where all of the communications go through the Windows socket APIs. (For that matter, just about any application that talks TCP/IP on Windows looks the same.)

Function Interception

The key to how WebPagetest works is its ability to intercept arbitrary function calls and inspect or alter the request or response before passing it on to the original implementation (or choosing not to pass it on at all). Luckily, someone else did most of the heavy lifting and provided a nice open source library that can take care of the details for you, but it basically works like this:

  • Find the target function in memory (trivial if it is exported from a DLL).

  • Copy the first several bytes from the function (making sure to keep x86 instructions intact).

  • Overwrite the function entry with a jmp to the new function.

  • Provide a replacement function that includes the bytes copied from the original function along with a jmp to the remaining code.

It’s pretty hairy stuff and things tend to go very wrong if you aren’t extremely careful, but with well-defined functions (like all of the Windows APIs), you can pretty much intercept anything you’d like.
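
As a rough illustration of the four steps above, the sketch below models process memory as a Python bytearray and computes the bytes a hook would write. It is a toy model, not WebPagetest's code or the library's API: the addresses, the `install_hook` helper, and the hand-supplied instruction lengths are all assumptions made for the example. A real implementation needs a disassembler to find instruction boundaries, has to fix up any copied instruction that uses relative addressing, and must change page protections before writing. The only x86 facts relied on are that a near `jmp` is the byte 0xE9 followed by a 4-byte little-endian displacement measured from the end of the instruction, and that 0x90 is a `nop`.

```python
import struct

JMP_SIZE = 5  # 0xE9 opcode + 4-byte signed displacement


def jmp_rel32(src: int, dst: int) -> bytes:
    """Encode a near jmp located at address `src` that lands on `dst`."""
    # The displacement is measured from the end of the 5-byte instruction.
    return b"\xE9" + struct.pack("<i", dst - (src + JMP_SIZE))


def prologue_size(instruction_lengths) -> int:
    """Smallest run of whole instructions that covers the 5-byte jmp."""
    total = 0
    for length in instruction_lengths:
        total += length
        if total >= JMP_SIZE:
            return total
    raise ValueError("function is too short to hook")


def install_hook(memory, target, hook, trampoline, instruction_lengths) -> int:
    """Patch `target` to jump to `hook`; return the trampoline address."""
    n = prologue_size(instruction_lengths)
    saved = bytes(memory[target:target + n])  # step 2: copy whole instructions
    # Step 4: trampoline = saved bytes + jmp back to the untouched remainder.
    stub = saved + jmp_rel32(trampoline + n, target + n)
    memory[trampoline:trampoline + len(stub)] = stub
    # Step 3: overwrite the entry point, padding leftover bytes with nop.
    memory[target:target + n] = jmp_rel32(target, hook) + b"\x90" * (n - JMP_SIZE)
    return trampoline


# Hypothetical layout: function at 0x10, hook at 0x80, trampoline at 0x40.
memory = bytearray(0x100)
# push ebp; mov ebp, esp; sub esp, 0x40; ret
memory[0x10:0x17] = bytes.fromhex("558bec83ec40c3")
install_hook(memory, target=0x10, hook=0x80, trampoline=0x40,
             instruction_lengths=[1, 2, 3, 1])
print(memory[0x10:0x16].hex())  # e96b00000090
print(memory[0x40:0x4B].hex())  # 558bec83ec40e9cbffffff
```

The replacement function calls the trampoline address whenever it wants the original behavior, which is how a hook can inspect a call and still pass it on.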

One catch is that you can only redirect calls to code running in the same process as the original function. That is fine if you wrote the code, but it doesn’t help much if you are trying to spy on software that you don’t control, which leads us to…

Code Injection

Lucky for me, Windows provides several ways to inject arbitrary code into processes. A good overview of several different techniques is available online; there are actually more ways to do it than it covers, but it explains the basics. Some of the techniques insert your code into every process, but I wanted to be a lot more targeted and instrument only the specific browser instances that we are interested in. After a bunch of experimentation (and horrible failures), I ended up using the CreateRemoteThread/LoadLibrary technique, which essentially lets you force any process to load an arbitrary DLL and execute code in it (assuming you have the necessary rights).
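
For readers who have not seen the CreateRemoteThread/LoadLibrary pattern, here is a minimal sketch of its usual shape, written with Python's `ctypes` against documented kernel32 functions. It is not WebPagetest's implementation: the `inject_dll` and `remote_path_bytes` names are made up for this example, error handling is minimal, and it assumes the caller and the target process have the same bitness so that `LoadLibraryW` sits at the same address in both.

```python
import ctypes
import sys

# Documented Win32 constants.
PROCESS_CREATE_THREAD = 0x0002
PROCESS_VM_OPERATION = 0x0008
PROCESS_VM_READ = 0x0010
PROCESS_VM_WRITE = 0x0020
PROCESS_QUERY_INFORMATION = 0x0400
MEM_COMMIT, MEM_RESERVE, MEM_RELEASE = 0x1000, 0x2000, 0x8000
PAGE_READWRITE = 0x04
INFINITE = 0xFFFFFFFF


def remote_path_bytes(dll_path: str) -> bytes:
    """NUL-terminated UTF-16LE string, the form LoadLibraryW expects."""
    return (dll_path + "\0").encode("utf-16-le")


def inject_dll(pid: int, dll_path: str) -> None:
    """Ask process `pid` to load `dll_path` (hypothetical helper)."""
    if sys.platform != "win32":
        raise OSError("this technique relies on the Win32 API")
    vp, u32, sz = ctypes.c_void_p, ctypes.c_uint32, ctypes.c_size_t
    k32 = ctypes.WinDLL("kernel32", use_last_error=True)
    # Declare prototypes so handles and pointers are not truncated on 64-bit.
    protos = {
        "OpenProcess": (vp, [u32, ctypes.c_int, u32]),
        "VirtualAllocEx": (vp, [vp, vp, sz, u32, u32]),
        "WriteProcessMemory": (ctypes.c_int, [vp, vp, ctypes.c_char_p, sz, vp]),
        "GetModuleHandleW": (vp, [ctypes.c_wchar_p]),
        "GetProcAddress": (vp, [vp, ctypes.c_char_p]),
        "CreateRemoteThread": (vp, [vp, vp, sz, vp, vp, u32, vp]),
        "WaitForSingleObject": (u32, [vp, u32]),
        "VirtualFreeEx": (ctypes.c_int, [vp, vp, sz, u32]),
        "CloseHandle": (ctypes.c_int, [vp]),
    }
    for name, (restype, argtypes) in protos.items():
        fn = getattr(k32, name)
        fn.restype, fn.argtypes = restype, argtypes

    access = (PROCESS_CREATE_THREAD | PROCESS_QUERY_INFORMATION |
              PROCESS_VM_OPERATION | PROCESS_VM_WRITE | PROCESS_VM_READ)
    process = k32.OpenProcess(access, 0, pid)
    if not process:
        raise ctypes.WinError(ctypes.get_last_error())
    try:
        data = remote_path_bytes(dll_path)
        # Reserve memory in the target and copy the DLL path into it.
        remote = k32.VirtualAllocEx(process, None, len(data),
                                    MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE)
        if not remote:
            raise ctypes.WinError(ctypes.get_last_error())
        try:
            if not k32.WriteProcessMemory(process, remote, data, len(data), None):
                raise ctypes.WinError(ctypes.get_last_error())
            # Start a thread in the target whose entry point is LoadLibraryW
            # and whose single argument is the path we just wrote.
            load_library = k32.GetProcAddress(
                k32.GetModuleHandleW("kernel32.dll"), b"LoadLibraryW")
            thread = k32.CreateRemoteThread(process, None, 0, load_library,
                                            remote, 0, None)
            if not thread:
                raise ctypes.WinError(ctypes.get_last_error())
            k32.WaitForSingleObject(thread, INFINITE)
            k32.CloseHandle(thread)
        finally:
            k32.VirtualFreeEx(process, remote, 0, MEM_RELEASE)
    finally:
        k32.CloseHandle(process)
```

Once the DLL is loaded, its entry point runs inside the target process, which is where the function hooks get installed.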

Resulting Browser Architecture

Now that we can intercept arbitrary function calls, it just becomes a matter of identifying the “interesting” functions, preferably ones that are used by all the browsers so you can reuse as much code as possible. In WebPagetest, we intercept all the Winsock calls that have to do with resolving host names, connecting sockets, and reading or writing data (Figure 1-2).

Figure 1-2. Browser architecture

This gives us visibility into all of the network activity from the browser, and we essentially just keep track of what the browsers are doing. Other than having to decode the raw byte streams, it is pretty straightforward and gives us a consistent way to do the measurements across all browsers. SSL does add a bit of a wrinkle, so we also intercept calls to the various SSL libraries that the browsers use so that we can see the unencrypted version of the data. This is a little more difficult for Chrome since the library is compiled into the Chrome code itself, but luckily they make debug symbols available for every build, so we can still find the code in memory.
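
To make the "keep track of what the browsers are doing" part concrete, here is a small sketch of the kind of bookkeeping the hooks could feed. Everything in it is hypothetical: the `SocketTracker` class, its callback names, and the millisecond timestamps are assumptions for illustration, not WebPagetest's data structures. Each callback stands in for an intercepted socket call and records when it happened and which bytes went by.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Connection:
    connect_start: Optional[int] = None
    connect_end: Optional[int] = None
    first_send: Optional[int] = None
    first_recv: Optional[int] = None
    sent: bytearray = field(default_factory=bytearray)
    received: bytearray = field(default_factory=bytearray)


class SocketTracker:
    """Collects per-socket events; all timestamps are in milliseconds."""

    def __init__(self) -> None:
        self.connections: Dict[int, Connection] = {}

    def _conn(self, sock: int) -> Connection:
        return self.connections.setdefault(sock, Connection())

    def on_connect_start(self, sock: int, now: int) -> None:
        self._conn(sock).connect_start = now

    def on_connect_end(self, sock: int, now: int) -> None:
        self._conn(sock).connect_end = now

    def on_send(self, sock: int, now: int, data: bytes) -> None:
        conn = self._conn(sock)
        if conn.first_send is None:
            conn.first_send = now
        conn.sent += data

    def on_recv(self, sock: int, now: int, data: bytes) -> None:
        conn = self._conn(sock)
        if conn.first_recv is None:
            conn.first_recv = now
        conn.received += data

    def summary(self, sock: int) -> dict:
        """Decode the first request/response on a socket (no keep-alive)."""
        conn = self.connections[sock]
        first_line = lambda raw: bytes(raw).split(b"\r\n", 1)[0].decode("latin-1")
        return {
            "request": first_line(conn.sent),
            "status": first_line(conn.received),
            "connect_ms": conn.connect_end - conn.connect_start,
            "ttfb_ms": conn.first_recv - conn.first_send,
        }


tracker = SocketTracker()
tracker.on_connect_start(7, 0)
tracker.on_connect_end(7, 40)
tracker.on_send(7, 41, b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n")
tracker.on_recv(7, 141, b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n")
print(tracker.summary(7))
```

A real tracker also has to handle several requests per connection and responses that arrive in many small reads, which is where most of the byte-stream decoding effort goes.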

The same technique is used to intercept drawing calls from the browser so we can tell when it paints to the screen (for the start render measurement).

Get the Code

Since WebPagetest is under a BSD license, you are welcome to reuse any of the code for whatever purposes you’d like. The project lives on Google Code, and some of the more interesting files are:

Browser Advancements

Luckily, browsers are starting to expose more interesting information in standard ways, and as the W3C Resource Timing spec advances, you will be able to access a lot of this information directly from the browser through JavaScript (even from your end users!).
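
The spec itself defines a JavaScript interface, so the sketch below only shows what could be done with its output. It assumes a hypothetical reporting setup in which each Resource Timing entry has been serialized to JSON in the browser and sent to a server; the `phase_durations` helper is made up for this example. The attribute names (`startTime`, `domainLookupStart`, `connectStart`, `requestStart`, `responseStart`, `responseEnd`, and so on) come from the spec, which reports the detailed timestamps as zero for cross-origin resources that do not opt in with a `Timing-Allow-Origin` header.

```python
from typing import Dict, Optional


def phase_durations(entry: Dict[str, float]) -> Dict[str, Optional[float]]:
    """Split one serialized Resource Timing entry into per-phase durations."""

    def span(start_key: str, end_key: str) -> Optional[float]:
        start, end = entry.get(start_key, 0), entry.get(end_key, 0)
        # A zero timestamp means the browser withheld the detail.
        if not start or not end:
            return None
        return end - start

    return {
        "dns": span("domainLookupStart", "domainLookupEnd"),
        "connect": span("connectStart", "connectEnd"),
        "ttfb": span("requestStart", "responseStart"),
        "download": span("responseStart", "responseEnd"),
        # startTime and responseEnd are always populated.
        "total": entry["responseEnd"] - entry["startTime"],
    }


same_origin = {
    "startTime": 100, "domainLookupStart": 100, "domainLookupEnd": 120,
    "connectStart": 120, "connectEnd": 160, "requestStart": 161,
    "responseStart": 261, "responseEnd": 300,
}
print(phase_durations(same_origin))
```

These are the same phases the socket-level hooks measure from outside the browser, but here they come from the browser itself.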


Originally published on Dec 01, 2011.