Reviewing Resource Sizing

Before getting started with Flex Gateway, analyze your workload and choose a configuration that gives you the best performance and auto-scaling rate.

When assigning resources for a replica node, consider:

- Number of APIs
- Amount of memory the node needs
- Processing power for the node
- Number of policies in each API
- Amount of traffic each node handles
- Amount of time the node takes to answer the request (latency)
- Type of traffic the node handles, and within the traffic:
  - The number of requests per second (RPS) the node handles
  - Size of each request (bytes)

Choose Appropriate Resources

The following table describes the resources you must allocate to your replica node according to your environment:

Node Size	CPU	RAM
Small	1-2 cores	2-4 GB
Medium	2-4 cores	4-8 GB
Large	8-16 cores	16-32 GB

For IBM Power10 deployments, a "Large" node has 5-16 CPU cores.

Nodes belonging to a cluster must be of the same size.

Choose Your Cluster Size

The following table describes appropriate node and cluster sizes:

Node Size	Number of APIs	Number of Policies	Expected Latency	Expected Throughput	Target Environment
Small	< 100	4	< 30 ms	< 500 RPS	Dev/test cluster
Medium	< 500	4	< 20 ms	< 2500 RPS	Production cluster
Large	> 500	4	< 10 ms	< 10000 RPS	Mission-critical clusters

The figures above are per node. API configurations are replicated to all nodes in the cluster. For example, if you deploy 10 APIs with 3 policies each, all 10 APIs and their policies are deployed to every node in the cluster.
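As a rough illustration of how the per-node figures can be read, the following Python sketch picks the smallest node size whose limits cover a workload. The names `NODE_PROFILES` and `pick_node_size` are hypothetical and not part of Flex Gateway; the limits are transcribed from the table, which assumes 4 policies and 1 KB requests, and the table's "<" bounds are treated as strict.

```python
# Per-node limits transcribed from the sizing table.
# Hypothetical helper for illustration only; not a Flex Gateway API.
NODE_PROFILES = [
    # (size, API limit, RPS limit)
    ("Small", 100, 500),
    ("Medium", 500, 2500),
    ("Large", None, 10000),  # the table lists "> 500" APIs with no upper bound
]


def pick_node_size(api_count: int, rps_per_node: int) -> str:
    """Return the smallest node size whose per-node limits cover the workload."""
    for size, api_limit, rps_limit in NODE_PROFILES:
        fits_apis = api_limit is None or api_count < api_limit
        if fits_apis and rps_per_node < rps_limit:
            return size
    raise ValueError("per-node traffic exceeds the Large profile; add nodes")


print(pick_node_size(api_count=10, rps_per_node=300))
print(pick_node_size(api_count=200, rps_per_node=300))
```

Because API configurations are replicated to every node, `api_count` is the total number of APIs in the cluster, while `rps_per_node` is the share of traffic a single node receives.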

Traffic is distributed across all nodes in the cluster. A small node supports up to 500 RPS, so a cluster of two small nodes supports a total of 1000 RPS (500 RPS for each node). Exceeding the recommended traffic per node eventually causes increased CPU usage. To avoid this issue, increase the size of all nodes in the cluster or scale horizontally.
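The arithmetic in the paragraph above can be sketched in Python as follows. The helper names are hypothetical, and the per-node limits come from the sizing table; treat the result as a starting estimate rather than a guarantee, since the table assumes 4 policies and 1 KB requests.

```python
import math

# Recommended maximum RPS per node, taken from the sizing table.
PER_NODE_RPS = {"Small": 500, "Medium": 2500, "Large": 10000}


def cluster_capacity_rps(node_size: str, node_count: int) -> int:
    """Total recommended RPS for a cluster of identically sized nodes."""
    return PER_NODE_RPS[node_size] * node_count


def nodes_needed(node_size: str, target_rps: int) -> int:
    """Minimum node count that keeps each node within its recommended RPS."""
    return max(1, math.ceil(target_rps / PER_NODE_RPS[node_size]))


print(cluster_capacity_rps("Small", 2))  # two small nodes, as in the example above
print(nodes_needed("Small", 1200))
```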

High CPU usage can lead to increased latency. In such a case, proceed with horizontal scaling or increase the node size.

High memory consumption can lead to out-of-memory errors. To solve this issue, increase the memory allocation or the node size. Horizontal scaling does not resolve high memory utilization. Increase memory first, and scale horizontally later.
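The guidance on CPU and memory pressure can be summarized as a small decision function. This is only a sketch: the function name and the 80% threshold are assumptions chosen for illustration, not values documented for Flex Gateway.

```python
def scaling_action(cpu_util: float, mem_util: float, threshold: float = 0.8) -> str:
    """Suggest a scaling step from CPU and memory utilization (0.0 to 1.0).

    The threshold is a hypothetical value; tune it for your environment.
    """
    # Memory pressure is checked first because horizontal scaling
    # does not resolve it.
    if mem_util >= threshold:
        return "increase memory or node size, then scale horizontally if needed"
    if cpu_util >= threshold:
        return "scale horizontally or increase node size"
    return "no change"


print(scaling_action(cpu_util=0.9, mem_util=0.5))
print(scaling_action(cpu_util=0.9, mem_util=0.95))
```

Checking memory before CPU reflects the order recommended above: increase memory first, and scale horizontally later.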

The sizing table assumes a cluster with an average request size of 1 KB and 4 policies applied. Adjust the figures for your use case.

The most commonly used out-of-the-box policies, which are considered in these guidelines, are:

- Message logging
- Rate limiting
- JWT validation
- Header injection

Using TLS affects response time and CPU consumption.
