Rate Limiting and Throttling - SLA-Based Policies

The Rate Limiting and Throttling - SLA-Based policies are client ID-based policies that use the ID as a reference to impose limits on the number of requests that each application can make within a period of time. To use these policies, create at least one SLA tier to define request limits as described in the tutorial. Require users to register an application through an API portal for engaging API users. The registration process grants the user a client ID and a client Secret.

You can configure an SLA tier to require manual approval of requests from users who want to register an app to call your API.

Rate Limiting - SLA-Based Policy

Rate limiting (not SLA-based) restricts the number of requests the API accepts from all applications within a certain time period. After reaching the limit, requests are rejected. Users don’t need to register applications or send Client IDs. SLA-based rate limiting restricts the number of requests by application to your API based on the configuration of an SLA tier. By default, the Client ID and token credentials are expected in the form of query parameters named client_id and client_secret. You can alter the Mule expression that points to this, and obtain these values from elsewhere in the HTTP message.

Throttling - SLA-Based Policy

SLA-based throttling works much like SLA-based rate limiting, except requests that exceed the SLA limits aren’t rejected, but instead are delayed and put in a queue. The queued requests are processed in the following period, or eventually rejected if they are queued repeatedly for an established number of times.

You can configure the interval between attempts during which requests are not accepted and you can limit the number of attempts.

SLA-Based Application Registration

If you create at least one SLA tier for an API, a user who requests access from a new application must choose an SLA Tier. You can offer multiple SLA tier choices based on different rate limits and throttling.

Managing API Access Requests

If the selected SLA Tier is set to automatic, then API Manager instantly approves all requests for API access, and users can immediately send authenticated requests to the API. If the selected SLA Tier was configured with manual approval, an admin of your organization has to approve it before a user can start to send valid requests to your API.

To approve API access requests, go to the API version page, and select the Applications tab in the bottom section of the screen.


There you can view details about your pending and processed requests and manage them.

Creating an SLA Tier or Layered SLAs

First, you create an SLA tier. If you choose manual approval, any application requests for access must be manually approved by an admin in your organization.

Multiple SLA limits are supported under the following conditions:

  • API Manager manages the API

  • API Gateway runtime 2.1 or later manages the API.

When an SLA tier with more than one limit is used for an API that runs on an API Gateway runtime earlier than 2.1, only the first defined limit is enforced.

You can specify multiple throughput limits for a single SLA tier. In the Add SLA Tier dialog, in Limits, set the first set of limits. For example:


Click the plus icon, and set the next set of limits.

A single SLA tier named gold can offer a limit of 100 requests per second as well as a limit of 10,000 requests per day. This ensures that applications registered on a gold tier don’t exceed a bursting limit of 100 request per second. Traffic bottlenecks are avoided, while ensuring that registered applications get the advertised SLA of 10,000 request per day. By unchecking the visible flag, the first limit of 100 requests is not exposed to application developers at registration time. When you configure only one limit, that limit must be visible.

Response Headers

Three headers are included in request responses that inform users about the SLA restrictions and inform them when nearing the threshold. When the SLA enforces multiple policies that limit request throughput, a single set of headers pertaining to the most restrictive of the policies provides this information.

For example, a user of your API may receive a response that includes these headers:

X-RateLimit-Limit: 20
X-RateLimit-Remaining: 14
X-RateLimit-Reset: 19100

Within the next 19100 milliseconds, only 14 more requests are allowed by the SLA, which is set to allow 20 within this time-window.

Rate Limiting and Throttling in a Mule cluster

The guidelines for rate limiting and throttling in a cluster applies to rate limiting and throttling for SLA-based policies.