Chapter 19. Load Balancing at the Frontend

We serve many millions of requests every second and, as you may have already guessed, we use more than a single computer to handle this demand. But even if we did have a supercomputer that was somehow able to handle all these requests (imagine the network connectivity such a configuration would require!), we still wouldn’t employ a strategy that relied upon a single point of failure; when you’re dealing with large-scale systems, putting all your eggs in one basket is a recipe for disaster.

This chapter focuses on high-level load balancing—how we balance user traffic between datacenters. The following chapter zooms in to explore how we implement load balancing inside a datacenter.

Power Isn’t the Answer

For the sake of argument, let’s assume we have an unbelievably powerful machine and a network that never fails. Would that configuration be sufficient to meet Google’s needs? No. Even this configuration would still be limited by the physical constraints of our networking infrastructure. For example, the speed of light caps how fast a signal can travel through fiber optic cable, which sets a floor on how quickly we can serve data over a given distance. And even in an ideal world, relying on an infrastructure with a single point of failure is a bad idea.
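To get a feel for the speed-of-light constraint, here is a minimal back-of-the-envelope sketch. It assumes that light in optical fiber travels at roughly two-thirds of its vacuum speed (a common rule of thumb, not a figure from this chapter) and ignores routing, queuing, and processing delays. The function name `min_round_trip_ms` and the sample distances are illustrative only.

```python
# Rough lower bound on round-trip time over optical fiber.
# Assumption: signals in fiber propagate at about 2/3 of the vacuum
# speed of light; real paths are longer and slower than this ideal.

SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in vacuum, km/s
FIBER_VELOCITY_FACTOR = 2 / 3      # assumed rule-of-thumb factor for fiber


def min_round_trip_ms(distance_km: float) -> float:
    """Return the minimum round-trip time in milliseconds for a
    one-way fiber distance given in kilometers."""
    if distance_km < 0:
        raise ValueError("distance_km must be non-negative")
    one_way_s = distance_km / (SPEED_OF_LIGHT_KM_S * FIBER_VELOCITY_FACTOR)
    return 2 * one_way_s * 1000


if __name__ == "__main__":
    # Hypothetical one-way distances between a user and a datacenter.
    for km in (100, 1_000, 10_000):
        print(f"{km:>6} km -> at least {min_round_trip_ms(km):.1f} ms round trip")
```

Under these assumptions, a user 10,000 km from the serving machine waits roughly 100 ms per round trip before any work is done at all, no matter how powerful that machine is.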

In reality, Google has thousands of machines and even more users, many of whom issue ...
