Update Nodes

Before You Begin

  • Set up and test the backup procedure on all clusters.

  • Perform a backup before attempting node maintenance to preserve data in the event of a failure.

Cluster and Application Updates

You must perform the following steps on one Anypoint Platform Private Cloud Edition node at a time to minimize or eliminate component breakage during the update process. You must complete the entire process on each node before the next node is addressed. After each node is updated, verify the health status of all cluster components and applications before proceeding with the next node.

Verify Cluster Health Status

Verify that the cluster is healthy before starting the update process. Otherwise, many of the assumptions made in this process are not valid.

  1. From a gravity shell, verify the output from the following commands:

    gravity status
    etcdctl cluster-health

    If the output from the previous commands mentions a degraded state or errors, fix all issues before proceeding.

  2. Verify that all pods are running:

    kubectl get pod -o wide --all-namespaces | grep -vi Running

    If the output from the previous command lists any pod not in "Running" state, then fix all issues before proceeding.

Install Updates

Select a node on which to perform the following steps:

  1. Retrieve the name of the target node:

    kubectl get nodes
  2. Drain the node:

    kubectl drain --delete-local-data --ignore-daemonsets <node_name>
  3. After the node is drained, identify the gravity unit name:

    sudo systemctl list-units --plain | awk '/planet-master/{print $1}'

    The previous command outputs a systemd unit name, which is used as the $unit_name value in the following step.

  4. Stop the gravity unit:

    systemctl stop $unit_name
    systemctl status $unit_name
  5. After the status subcommand indicates that the gravity unit is stopped, install the updates and reboot the server.

Perform Post-update Verification

  1. Verify that all systems correctly restarted:

    systemctl status --failed

    The previous command should return zero units, which means that your system restart was successful and all units started. If any units failed to start, fix all issues before proceeding.

  2. Check the gravity systemd units:

    systemctl list-units '*gravity__gravitational*'
  3. Check for errors in the logs by running journalctl.

  4. If everything started correctly, run gravity enter and verify the results.

  5. Check the cluster status again:

    gravity status
    etcdctl cluster-health
    gravity get pod -o wide--all-namespaces | grep -vi Running

Rejoin the Node

  1. Check the gravity Planet master and teleport status:

    systemctl list-units '*gravity__gravitational*'
  2. If the status of the gravity service is not loaded, active, or running, then you must enable it:

    systemctl enable --now <name_of_the_gravity_unit>
  3. Uncordon the node to set the target node to again schedule pods. This can take a few minutes to complete.

    kubectl uncordon <node_name>
  4. Verify cluster status:

    gravity status

    This command should display a healthy status for all nodes of the cluster. After you verify that the node has rejoined the cluster and is in healthy status, repeat the process for the next node in the cluster until all nodes are updated.

Was this article helpful?

💙 Thanks for your feedback!

Edit on GitHub
Give us your feedback!
We want to build the best documentation experience for you!
Help us improve with your feedback.
Take the survey!