Skip to main content

Recycle-node

The recycle-node pipeline runs every day on the live cluster, it executes the cloud-platform cli command to replace the oldest worker node by:

  • Cordoning the oldest node
  • Draining the node
  • Force drain to delete the remaining “stuck pods”
  • Terminating the node
  • Ensure the cluster is healthy with the same set of nodes before the process started

To recycle to oldest node on the cluster in your current context:

cloud-platform cluster recycle-node --oldest

To recycle a given node on the cluster in your current context

cloud-platform cluster recycle-node --name ip-XXX.XX.XX.XX.eu-west-2.compute.internal

To recycle a node previously cordoned but not drained (this is usually the result of an interruption of failure)

example error:

FATA[0000] node ip-172-20-53-167.eu-west-2.compute.internal is already cordoned, abort  subcommand=recycle-node
cloud-platform cluster recycle-node --ignore-label

The other optional flags are

      --aws-access-key string   aws access key to use
      --aws-profile string      aws profile to use
      --aws-region string       aws region to use (default "eu-west-2")
      --aws-secret-key string   aws secret to use
      --debug                   enable debug logging
  -f, --force                   force the pods to drain (default true)
  -h, --help                    help for recycle-node
  -i, --ignore-label            whether to ignore the labels on the resource
      --kubecfg string          path to kubeconfig file (default "/Users/jasonbirchall/.kube/config")
  -n, --name string             name of the resource to recycle
      --oldest                  whether to recycle the oldest node
  -t, --timeout int             amount of time to wait for the drain command to complete (default 360)

For other flags, refer the help with --help flag.

This page was last reviewed on 9 January 2024. It needs to be reviewed again on 9 April 2024 by the page owner #cloud-platform .
This page was set to be reviewed before 9 April 2024 by the page owner #cloud-platform. This might mean the content is out of date.