Contact Us 1-800-596-4880

Configuring a Readiness or Liveness Probe

Flex Gateway supports two methods for ensuring reliability and efficiency.

  • Readiness probes test whether a gateway instance is configured correctly and ready for incoming traffic. This ensures that end-users are not affected by partially configured APIs. For information, refer to Configure a Readiness Probe.

  • Liveness probes test whether a gateway instance is operational. This minimizes downtime due to crashes or other failures. For information, refer to Configure a Liveness Probe.

Configure a Readiness Probe

To test whether a Flex Gateway instance is ready for incoming traffic, use the following CLI command:

flexctl probe --check=readiness

If the Flex Gateway instance is ready to receive incoming traffic, the command returns an exit code of 0. Otherwise, the command returns an exit code of 1.

When Flex Gateway starts, its readiness state is false. The state changes to true after the following conditions are met:

  • API instances are deployed (at least one API must be configured)

  • Policies are initialized

The readiness state is not impacted when you apply configuration changes to a running gateway instance. This ensures that the instance continues to receive traffic during configuration updates.

Refer to the following for information about configuring readiness probes in specific environments:

  • Kubernetes and OpenShift

  • Docker, Podman, and Linux

By default, Flex Gateway running on Kubernetes includes a readiness probe configured in the Helm chart. The readiness probe automatically runs flexctl probe --check=readiness --allow-api-errors --allow-envoy-errors --allow-policy-errors at an interval of 10 seconds.

The readiness probe has the following default values:

readinessProbe:
 exec:
   command:
   - flexctl
   - probe
   - --check=readiness
   - --allow-api-errors
   - --allow-envoy-errors
   - --allow-policy-errors
 initialDelaySeconds: 10
 periodSeconds: 10
 failureThreshold: 2
 timeoutSeconds: 5

Parameter

Description

exec.command

The readiness probe command

--allow-api-errors

The flag to allow errors in API instances

--allow-envoy-errors

The flag to allow Envoy configuration errors

--allow-policy-errors

The flag to allow policy configuration errors

initialDelaySeconds

The time in seconds to wait after startup before executing the first readiness probe

periodSeconds

The period in seconds between each readiness probe

failureThreshold

The number of failed readiness probes before the Kubernetes readiness status is set to false

timeoutSeconds

The time in seconds before the readiness probe times out

To modify the default parameters, update your Helm chart. You can modify these parameters during or after the initial installation of the Helm chart. For more information about updating a Helm chart, see Update Pod Settings for a Flex Gateway Deployment Through a Helm Chart.

When using a readiness probe in Kubernetes (or Openshift) and Connected Mode, omit the --wait flag in the Helm chart used to deploy Flex Gateway. Not doing so might result in an installation failure because Helm waits for a readiness state of true. If there’s no API deployed to your Flex Gateway instance, the readiness state is false. This is not an issue with custom readiness probe configurations or gateways that have an API configured.

Configure External Components

Configure your external components to query the readiness state. For example, in AWS virtual machines that run behind a load balancer, configure the readiness query as a health check. To expose the readiness API to external components, apply the following configuration to the Flex Gateway instance:

apiVersion: gateway.mulesoft.com/v1alpha1
kind: Configuration
metadata:
 name: probe
spec:
 probe:
   enabled: true
   port: 3000

Applying the YAML configuration snippet enables the readiness API to respond to GET requests:

http://<flex-gateway-url>:<port-configured>/probes/readiness?allowEnvoyErrors=<true|false>&allowAPIErrors=<true|false>&allowPolicyErrors=<true|false>

Configure Error Tolerance

The default Kubernetes readiness probe is configured to be error-tolerant. If a configuration error occurs (for example, if you misconfigure a policy), the readiness state changes to true after the API instances and policies are deployed. This enables Flex Gateway to scale even though there might be configuration errors.

You can configure the readiness probe to be error-intolerant, meaning that misconfigured gateway instances never reach a readiness state of true.

To configure the Helm chart for error intolerance, remove the following flags from the readiness probe:

  • --allow-api-errors

  • --allow-envoy-errors

  • --allow-policy-errors

In deployments that query the readiness state through HTTP requests, use these query parameters:

  • allowEnvoyErrors=false

  • allowAPIErrors=false

  • allowPolicyErrors=false

For Docker and Linux deployments, Flex Gateway doesn’t include a pre-configured default readiness probe. However, you can run the readiness probe command manually or configure it for automated use with third-party services. Besides the Kubernetes readiness probe, MuleSoft doesn’t provide support for third-party readiness probes.

To expose the readiness API to external components, apply the following configuration to the Flex Gateway instance:

apiVersion: gateway.mulesoft.com/v1alpha1
kind: Configuration
metadata:
 name: probe
spec:
 probe:
   enabled: true
   port: 3000

Readiness is also probed during shutdown, when the gateway sends a Connection: close header. A SIGTERM signal initiates a drain of downstream connections by signaling them to reconnect. The default drain period is 25 seconds, after which Flex Gateway exits gracefully.

You can modify a drain period via the FLEX_SERVICE_ENVOY_DRAIN_TIME environment variable in Docker and Linux, or the gateway.drainSeconds option in a Kubernetes Helm chart.

Flex Gateway is preconfigured for a graceful shutdown of 30 seconds, which is 5 seconds more than the drain period. When increasing the drain period, also increase the shutdown period using TimeoutStopSec in Linux or terminationGracePeriodSeconds in Kubernetes. The shutdown period must exceed the drain period.

The default readiness probe in Kubernetes runs every 10 seconds with a failure threshold of 2, allowing shutdown detection within 20 seconds. This ensures that no new traffic is sent to the instance shutting down. Use similar configurations with readiness probes in other environments like AWS Load Balancer.

Configure a Liveness Probe

To test whether a Flex Gateway instance is operational, use the following CLI command:

flexctl probe --check=liveness

If the Flex Gateway instance is operational, the command returns an exit code of 0. Otherwise, the command returns an exit code of 1.

You can either run the liveness probe command manually, or configure the command to run automatically. By default, Flex Gateway Kubernetes deployments have an automatic liveness probe configured. The default probe periodically runs the liveness probe command and automatically restarts the Flex Gateway pod after a specified number of failures.

Refer to the following for information about configuring liveness probes in specific environments:

  • Kubernetes and OpenShift

  • Docker, Podman, and Linux

By default, Flex Gateway running on Kubernetes includes a liveness probe configured in the Helm chart. The liveness probe automatically runs flexctl probe --check=liveness at an interval of 10 seconds and restarts non-operational pods after 5 failed tests.

The liveness probe is configured by default with the following values:

livenessProbe:
 exec:
   command:
   - flexctl
   - probe
   - --check=liveness
 initialDelaySeconds: 10
 periodSeconds: 10
 failureThreshold: 5
 timeoutSeconds: 1

Parameter

Description

exec.command

The liveness probe command

initialDelaySeconds

The time in seconds to wait after startup before running the first liveness probe

periodSeconds

The period in seconds between each liveness probe

failureThreshold

The number of failed liveness probes before the Kubernetes pod is restarted

timeoutSeconds

The time in seconds before the liveness probe times out

To modify the default parameters, update your Helm chart. You can modify these parameters during or after the initial installation of the Helm chart. For more information about updating a Helm chart, see Update Pod Settings for a Flex Gateway Deployment Through a Helm Chart.

For Docker and Linux deployments, Flex Gateway doesn’t include a pre-configured default liveness probe. However, you can run the liveness probe command manually, or configure it for automated use with third-party services. Besides the Kubernetes liveness probe, MuleSoft doesn’t provide support for third-party liveness probes.

One method of running the liveness probe command with Docker is to configure HEALTHCHECK in your docker run command. For more information, see Docker run HEALTHCHECK.