Like any other highly available service, an API deluged with traffic can crash or deliver poor performance. Traffic floods can come from poorly coded applications or from malicious attacks. Whatever the origin of the flood, you have to safeguard the system against excessive traffic. API practitioners refer to the various ways of controlling API traffic as traffic management. Other terms, such as “rate limiting” and “throttling,” are also in use. In this section we propose specific terms for distinct purposes.
There are both business and technical motives for controlling the amount of traffic that an API handles. Rather than lump the resulting controls together, we’ll distinguish three kinds:
Quotas limit the amount of traffic to an API for business reasons. We introduced quotas in Chapter 5 and will elaborate here.
Throttling delays the responses to certain API calls in order to put a limit on throughput.
Spike arresting stops traffic spikes that might be caused by buggy clients or by attackers.
All APIs should have some level of spike arresting, if for no other reason than to protect against disaster. Whether to deploy quotas or throttling depends on the business requirements of the API.
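To make the distinction between the three controls concrete, here is a minimal Python sketch. It is illustrative only: the class names, parameters, and the specific algorithms (a fixed-window counter for quotas, simple pacing for throttling, and a minimum gap between calls for spike arresting) are assumptions chosen for brevity, not the method of any particular API platform. The clock and sleep functions are injectable so the behavior can be checked without real waiting.

```python
import time


class Quota:
    """Business limit: at most `limit` calls per window of `window_s` seconds."""

    def __init__(self, limit, window_s, clock=time.monotonic):
        self.limit = limit
        self.window_s = window_s
        self.clock = clock
        self.window_start = clock()
        self.used = 0

    def allow(self):
        now = self.clock()
        if now - self.window_start >= self.window_s:
            # A new window has begun, so the allowance resets.
            self.window_start = now
            self.used = 0
        if self.used < self.limit:
            self.used += 1
            return True
        return False  # Over quota: the call is refused, not delayed.


class Throttle:
    """Delays calls so they are spaced at least 1/rate_per_s seconds apart."""

    def __init__(self, rate_per_s, clock=time.monotonic, sleep=time.sleep):
        self.interval = 1.0 / rate_per_s
        self.clock = clock
        self.sleep = sleep
        self.next_free = clock()

    def wait(self):
        now = self.clock()
        delay = max(0.0, self.next_free - now)
        if delay > 0:
            self.sleep(delay)  # The call is slowed down rather than rejected.
        self.next_free = max(now, self.next_free) + self.interval
        return delay


class SpikeArrest:
    """Rejects a call arriving less than `min_gap_s` after the last accepted one."""

    def __init__(self, min_gap_s, clock=time.monotonic):
        self.min_gap_s = min_gap_s
        self.clock = clock
        self.last = None

    def allow(self):
        now = self.clock()
        if self.last is not None and now - self.last < self.min_gap_s:
            return False  # Too soon after the previous call: treat as a spike.
        self.last = now
        return True
```

In this sketch the quota and the spike arrest both refuse calls, but over very different time scales (a billing period versus a fraction of a second), while the throttle never refuses a call and only slows it down. A real deployment would also track these counters per application or per API key rather than globally.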
There are a number of solid business reasons to control the amount of traffic that is accepted by an API. These include:
The desire to offer tiered API access, with less access for new applications ...