Chapter 6. Auto Scaling and Elastic Load Balancing

Most applications have peaks and troughs of user activity. Consumer web applications are a good example. A website that is popular only in the United Kingdom is likely to experience very low levels of user activity at three o’clock in the morning, London time. Business applications exhibit the same behavior: a company’s internal HR system will see high usage during business hours, and often very little traffic outside these times.

Capacity planning is the process of calculating which resources will be required to ensure that application performance remains at acceptable levels. A traditional datacenter environment needs enough capacity to satisfy peak demand, leading to wasted resources during lulls in activity. If your application requires ten servers to satisfy peak demand and only one server during quiet times, up to nine of those servers are regularly going to waste.
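To make the arithmetic concrete, here is a minimal Python sketch of the idle capacity that results from provisioning for peak demand. The function name `idle_server_hours` and the 24-hour demand profile are assumptions made up for illustration; only the figures of ten servers at peak and one server during quiet times come from the example above.

```python
def idle_server_hours(provisioned, hourly_demand):
    """Return the server-hours that are provisioned but not needed."""
    return sum(max(provisioned - needed, 0) for needed in hourly_demand)


# Hypothetical demand profile for one day (an assumption, not measured data):
# 1 server for 8 overnight hours, 10 servers for 12 busy hours, 1 for the last 4.
hourly_demand = [1] * 8 + [10] * 12 + [1] * 4

# A fixed-size datacenter must provision for the busiest hour.
peak = max(hourly_demand)

wasted = idle_server_hours(peak, hourly_demand)
total = peak * len(hourly_demand)

print(f"{wasted} of {total} server-hours sit idle ({wasted / total:.0%})")
```

With a different demand profile the idle share changes, but the pattern holds: the larger the gap between peak and quiet periods, and the longer the quiet periods last, the more fixed capacity goes unused.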

Because bringing physical hardware online takes so long, traditional capacity planning must take future growth into consideration; otherwise, services would be subject to outages just as they become popular, and system administrators would spend more time ordering new hardware than configuring it. Getting these growth predictions wrong presents two risks. First, if your application fails to grow as much as expected, you have wasted a lot of money on hardware. Conversely, if you fail to anticipate explosive growth, you may find the continuation ...
