Upgrade EKS addons
We manage three addons through the cloud-platform-terraform-eks-add-ons module.
Before every EKS major version upgrade, check whether the addon versions match the EKS version the cluster is currently on, and upgrade any that do not.
After every EKS major version upgrade, check whether the addon versions match the EKS version you have just upgraded to, and upgrade any that do not.
The addons managed through the cloud-platform-terraform-eks-add-ons module are kube-proxy, vpc-cni and coredns.
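To make the "check" concrete, here is a minimal sketch that prints the cluster's Kubernetes version next to the installed version of each addon. It assumes the AWS CLI is configured for the right account and that the addons are installed as EKS managed addons; the function name and the cluster name argument are illustrative, not part of the module.

```shell
# Sketch: print the cluster version and the installed version of each addon.
# Assumes AWS credentials for the target account are already set up.
show_addon_versions() {
  local cluster_name="$1" addon version
  version="$(aws eks describe-cluster --name "${cluster_name}" \
    --query 'cluster.version' --output text)"
  printf 'cluster\t%s\n' "${version}"
  for addon in kube-proxy vpc-cni coredns; do
    version="$(aws eks describe-addon --cluster-name "${cluster_name}" \
      --addon-name "${addon}" --query 'addon.addonVersion' --output text)"
    printf '%s\t%s\n' "${addon}" "${version}"
  done
}
```

Call it as show_addon_versions followed by your cluster name, then compare the result with the supported versions for that Kubernetes version.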
Listing available EKS upgrades
eksctl Install
brew install eksctl
Run the following command to get a list of supported addon versions for a given Kubernetes version.
eksctl utils describe-addon-versions --kubernetes-version [k8s-version] --name [addon name: kube-proxy/vpc-cni/coredns] | grep AddonVersion
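If you want the list for all three addons in one go, the command above can be wrapped in a loop. This is only a sketch: the function name is made up for illustration, and it assumes eksctl is installed and AWS credentials are available.

```shell
# Sketch: list supported versions of every managed addon for one Kubernetes version.
list_supported_addon_versions() {
  local k8s_version="$1" addon
  for addon in kube-proxy vpc-cni coredns; do
    echo "== ${addon} =="
    eksctl utils describe-addon-versions \
      --kubernetes-version "${k8s_version}" \
      --name "${addon}" | grep AddonVersion
  done
}
```

Call it with the Kubernetes version you are targeting as the only argument.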
Error: error creating EKS Add-On (cluster-name:addon-name): InvalidParameterException: Addon version specified is not supported
If you come across this error, it is possible that the AWS UI has incorrectly informed you about the supported version. To figure out which version is supported, set K8S_VERSION and ADDON_NAME in your shell and run the following command:
aws eks describe-addon-versions --kubernetes-version "$K8S_VERSION" | jq --arg ADDON_NAME "$ADDON_NAME" '.addons[] | select(.addonName==$ADDON_NAME) | .addonVersions[] | select(.compatibilities[] | .defaultVersion==true)'
This will pull out the default compatible version of your addon for that Kubernetes version. The --arg option is needed because the jq filter is single-quoted, so the shell does not expand $ADDON_NAME inside it.
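When you only need the version string (for example to paste into the module), a narrower filter can help. The sketch below assumes the AWS CLI and jq are installed; the function name is illustrative. It uses the --addon-name option of describe-addon-versions to filter server-side, and jq's any() so that a version is printed once even if it has several compatibility entries.

```shell
# Sketch: print only the default addon version string for a Kubernetes version.
default_addon_version() {
  local k8s_version="$1" addon_name="$2"
  aws eks describe-addon-versions \
    --kubernetes-version "${k8s_version}" \
    --addon-name "${addon_name}" |
    jq -r '.addons[].addonVersions[]
           | select(any(.compatibilities[]; .defaultVersion))
           | .addonVersion'
}
```

Call it with the Kubernetes version first and the addon name second.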
Preparing for upgrade
Check the changelog for each of the addons and determine if there are any breaking changes.
Create a thread in #cloud-platform notifying the team that upgrades are starting and the pipelines will be paused.
Starting the upgrade
- Bump the version number in cloud-platform-terraform-eks-add-ons
- Commit changes on a new branch and create a pull request
- Request review from someone on the team
- Merge the pull request and create a new release through the GitHub UI
- Bump the version number of the cloud-platform-terraform-eks-add-ons in cloud-platform-infrastructure
- Commit changes on a new branch and create a pull request
- Request review from someone on the team
- Check the Terraform plan in Concourse and pause the following pipelines:
- bootstrap
- infrastructure-live
- infrastructure-manager
- infrastructure-live-2
- Capture the configuration of a pod before the upgrade:
kubectl -n kube-system get pod $addon -oyaml
There is also a helper script.
- Merge the pull request
- Unpause an infrastructure pipeline and wait for it to complete
- While running:
- Keep an eye on pods recycling
watch -n 1 "kubectl -n kube-system get pods"
- Keep an eye on events
watch -n 1 "kubectl -n kube-system get events"
- Run the reporting pipeline on the infrastructure environment
- If everything is green, repeat steps 11-14 (unpause the pipeline, watch pods and events, run the reporting pipeline) on each remaining environment.
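The pre-upgrade capture step can be made repeatable with a small function. This is a sketch, not the helper script mentioned above. It snapshots the workload objects rather than a single pod, and it assumes the default EKS workload names: daemonset kube-proxy, daemonset aws-node (for vpc-cni) and deployment coredns. Check these against your cluster before relying on it. The function name and output directory are illustrative.

```shell
# Sketch: save the YAML of each addon workload into a directory.
# Workload names are assumed EKS defaults; adjust if your cluster differs.
snapshot_addon_workloads() {
  local out_dir="${1:-addon-snapshot}" workload
  mkdir -p "${out_dir}"
  for workload in daemonset/kube-proxy daemonset/aws-node deployment/coredns; do
    kubectl -n kube-system get "${workload}" -o yaml \
      > "${out_dir}/$(basename "${workload}").yaml"
  done
}
```

Run it with one directory name (say, before) ahead of the merge and another (say, after) once the pipeline completes, then diff -ru the two directories. Expect some noise from fields such as resourceVersion and status.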
Finish the upgrade
Finish up communications and close the thread.
Finally, ensure that the underlying image versions for all addons are updated in the "Container Images used by Cluster Components" runbook.
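To collect the image versions for that runbook, something like the following may save some clicking. It is a sketch that assumes the default EKS workload names (daemonset kube-proxy, daemonset aws-node for vpc-cni, deployment coredns) and only reads the main containers, not init containers. The function name is illustrative.

```shell
# Sketch: print the container images currently set on each addon workload.
# Workload names are assumed EKS defaults; init containers are not listed.
list_addon_images() {
  local workload
  for workload in daemonset/kube-proxy daemonset/aws-node deployment/coredns; do
    printf '%s\t' "${workload}"
    kubectl -n kube-system get "${workload}" \
      -o jsonpath='{.spec.template.spec.containers[*].image}'
    printf '\n'
  done
}
```

Run list_addon_images against each environment after its upgrade and copy the image tags into the runbook.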