Tips and Tricks

A collection of useful one-liners and code snippets.

Delete an RDS database snapshot

$ aws rds delete-db-snapshot --db-snapshot-identifier cloud-platform-2f8c3c1ceaf0a05d-finalsnapshot

Check the expiration date of the SSL certificate for a live domain

$ echo | openssl s_client -connect google.com:443 | openssl x509 -noout -enddate
depth=1 C = US, O = Google Trust Services, CN = Google Internet Authority G3
verify error:num=20:unable to get local issuer certificate
verify return:0
DONE
notAfter=Oct 21 18:23:00 2019 GMT
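To turn the notAfter value into a number of days remaining, a small helper along these lines may be useful. This is a sketch, not part of the original tip: the function name days_until_expiry and its optional second argument (a reference time in epoch seconds, handy for testing) are illustrative, and it assumes GNU date, since the -d flag behaves differently on BSD/macOS.

```shell
# Hypothetical helper: whole days between a reference time and an
# openssl "notAfter" timestamp. Assumes GNU date (date -d).
days_until_expiry() {
  local not_after="$1"                   # e.g. "Oct 21 18:23:00 2019 GMT"
  local now_epoch="${2:-$(date +%s)}"    # defaults to the current time
  local end_epoch
  end_epoch=$(date -u -d "$not_after" +%s) || return 1
  # Integer division; a negative result means the certificate has expired.
  echo $(( (end_epoch - now_epoch) / 86400 ))
}

# Usage (needs network access):
#   end=$(echo | openssl s_client -connect google.com:443 2>/dev/null \
#         | openssl x509 -noout -enddate | cut -d= -f2)
#   days_until_expiry "$end"
```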

Run etcdctl on a master node

While SSHed onto a master:

$ sudo docker ps | grep etcd.manager

Pick the kopeio/etcd-manager container you’re interested in (‘main’ or ‘events’), then docker exec into it:

$ sudo docker exec -it [container id] bash

Now you can run etcdctl like this:

cd /opt/etcd-v3.4.3-linux-amd64

ETCDCTL_API=3 \
  ./etcdctl --key /rootfs/etc/kubernetes/pki/kube-apiserver/etcd-client.key \
  --cert  /rootfs/etc/kubernetes/pki/kube-apiserver/etcd-client.crt \
  --cacert /rootfs/etc/kubernetes/pki/kube-apiserver/etcd-ca.crt  \
  --endpoints "https://etcd-a.internal.live-1.cloud-platform.service.justice.gov.uk:4001,https://etcd-b.internal.live-1.cloud-platform.service.justice.gov.uk:4001,https://etcd-c.internal.live-1.cloud-platform.service.justice.gov.uk:4001" endpoint health

Delete a “stuck” resource

kubectl can hang when resources are stuck in the “Terminating” status. These workarounds can force a cleanup:

Identify what resources are still present in a namespace:

ns=my-app ; for api in `kubectl api-resources --verbs=list --namespaced -o name` ; do echo $api ;  kubectl get $api -n $ns ; done

If a regular delete does not remove the resource, try

kubectl delete <resource> --grace-period=0 --force

and/or

kubectl patch <resource> -p '{"metadata":{"finalizers":[]}}' --type=merge

Here <resource> identifies the object to remove, such as a namespace, CRD or pod.

Filter namespaces by specific label or annotation

For the label cloud-platform.justice.gov.uk/is-production with value true

kubectl get ns -l=cloud-platform.justice.gov.uk/is-production=true

You can use awk to output specific columns:

kubectl get namespaces -l=cloud-platform.justice.gov.uk/is-production=true | awk '{print $1,$2}'
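The awk command above also prints the header row (NAME, STATUS). When the output feeds another command, it can help to drop that row. A minimal sketch, where the function name first_column is illustrative:

```shell
# Hypothetical helper: print the first column of kubectl's table output,
# skipping the header row (NR > 1 means "every line after the first").
first_column() {
  awk 'NR > 1 { print $1 }'
}

# Usage (assumes kubectl is configured for the cluster):
#   kubectl get namespaces -l=cloud-platform.justice.gov.uk/is-production=true | first_column
```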

To filter namespaces by label and display each one in “namespace / annotation” format:

kubectl get namespaces -l=cloud-platform.justice.gov.uk/is-production=true -ojson | jq -r '.items[] | .metadata.name + " / " + .metadata.annotations["cloud-platform.justice.gov.uk/source-code"] '

If you need to install jq, do brew install jq

Find all pods running on a specific worker node

kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=ip-172-20-80-62.eu-west-2.compute.internal

Count pods running on all nodes

Paste this into the search field on Prometheus:

max by(node) (max by(instance) (kubelet_running_pod_count{job="kubelet",metrics_path="/metrics"}) * on(instance) group_left(node) kubelet_node_name{job="kubelet",metrics_path="/metrics"})
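As a rough cross-check that does not depend on Prometheus, per-node pod counts can also be derived from kubectl. The sketch below is an assumption-laden alternative rather than an equivalent of the query above: the function name pods_per_node is hypothetical, and it counts pods the API server reports as Running, so the numbers may differ slightly from kubelet_running_pod_count.

```shell
# Hypothetical helper: count Running pods per node via the API server.
pods_per_node() {
  kubectl get pods --all-namespaces --no-headers \
    -o custom-columns=NODE:.spec.nodeName \
    --field-selector status.phase=Running |
    sort | uniq -c |
    awk '{ print $2, $1 }'   # reorder "count node" into "node count"
}

# Usage (assumes kubectl is configured for the cluster):
#   pods_per_node
```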

Install tmux and docker on the live-1 bastion host

From here

sudo apt update
sudo apt install tmux
sudo apt install apt-transport-https ca-certificates curl gnupg2 software-properties-common
curl -fsSL https://download.docker.com/linux/debian/gpg | sudo apt-key add -
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/debian $(lsb_release -cs) stable"
sudo apt update
apt-cache policy docker-ce
sudo apt install docker-ce
sudo systemctl status docker

Output all records from Route53 as a CSV file

Use this script

Add more RSS feeds to #cloud-platform-rss channel

/feed subscribe <feed address>

More slash commands are available to list and remove feeds and to show help. Type /feed help in the #cloud-platform-rss channel for details.

To validate your feed URL, test it using the W3C validator

To get the RSS feed URL for GitHub projects, use <github repo url>/releases.atom

To add your first feed to a new Slack channel, follow the steps provided by Slack

Find files which don’t contain a particular string

for file in $(find * -name '*.erb'); do grep -q last_reviewed_on $file || echo $file; done
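The loop above splits file names on whitespace, so paths containing spaces are mishandled. One possible alternative is grep -L (list files with no match) driven by find. This is a sketch; the function name files_without and its argument order are illustrative.

```shell
# Hypothetical helper: list files under DIR matching GLOB that do NOT
# contain the fixed string NEEDLE.
files_without() {
  local dir="$1" glob="$2" needle="$3"
  # -L prints names of files without a match; -F treats NEEDLE literally.
  find "$dir" -type f -name "$glob" -exec grep -L -F -- "$needle" {} +
}

# Usage:
#   files_without . '*.erb' last_reviewed_on
```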

Get AWS EC2 instance information for a node

aws ec2 describe-instances --filters 'Name=private-dns-name,Values=ip-172-20-127-140.eu-west-2.compute.internal'
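The command above returns the full JSON description. If only the instance ID is needed (for example to pass to another aws command), the CLI's --query and --output options can trim the response. A sketch, with a hypothetical function name instance_id_for_node:

```shell
# Hypothetical helper: print the EC2 instance ID(s) for a node's private DNS name.
instance_id_for_node() {
  local node="$1"   # e.g. ip-172-20-127-140.eu-west-2.compute.internal
  aws ec2 describe-instances \
    --filters "Name=private-dns-name,Values=${node}" \
    --query 'Reservations[].Instances[].InstanceId' \
    --output text
}

# Usage (assumes AWS credentials and region are already configured):
#   instance_id_for_node ip-172-20-127-140.eu-west-2.compute.internal
```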

Create a one-off job to repeat a cronjob

kubectl -n concourse-main create job ds-update-orphaned-statefiles --from=cronjob/orphaned-terraform-statefiles
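After creating the one-off job you will usually want to wait for it and read its logs. The sketch below chains those steps; the function name run_cronjob_once and the 600s timeout are assumptions, so adjust them to suit the job.

```shell
# Hypothetical helper: create a job from a cronjob, wait for it, print its logs.
run_cronjob_once() {
  local ns="$1" cronjob="$2" job="$3"
  kubectl -n "$ns" create job "$job" --from="cronjob/${cronjob}" &&
    # The timeout is an assumed value; the wait fails if the job does not
    # reach the "complete" condition in time.
    kubectl -n "$ns" wait --for=condition=complete "job/${job}" --timeout=600s &&
    kubectl -n "$ns" logs "job/${job}"
}

# Usage:
#   run_cronjob_once concourse-main orphaned-terraform-statefiles ds-update-orphaned-statefiles
```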

Hijack a concourse job container

fly -t manager hijack -j environments-terraform/apply-namespace-changes-live-1

You’ll be presented with a list of containers to choose from.

Help when manually deleting AWS resources

Occasionally you’ll need to manually delete AWS components. A neat trick is to use the AWS Tag Editor.

  • Open the Tag Editor.
  • Select “All Supported Resources” from the “Resource type” drop down.
  • Click “Search Resources”.
  • Use the filter to search for a tag, for example a cluster name: cp-0909-1202.

You’ll be presented with all resources associated with your tag. Click the link to navigate to each resource and delete as appropriate.

Find out why your namespace isn’t deleting

kubectl api-resources --verbs=list --namespaced -o name \
| xargs -n 1 kubectl get --show-kind --ignore-not-found -n <namespace-name>

The output should show why the API server cannot delete the namespace. For example:

Error from server: conversion webhook for acme.cert-manager.io/v1alpha2, Kind=Order failed: Post https://cert-manager-webhook.cert-manager.svc:443/convert?timeout=30s: service "cert-manager-webhook" not found
Error from server: conversion webhook for cert-manager.io/v1alpha2, Kind=CertificateRequest failed: Post https://cert-manager-webhook.cert-manager.svc:443/convert?timeout=30s: service "cert-manager-webhook" not found
Error from server: conversion webhook for cert-manager.io/v1alpha2, Kind=Certificate failed: Post https://cert-manager-webhook.cert-manager.svc:443/convert?timeout=30s: service "cert-manager-webhook" not found
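On clusters that populate it, the namespace's own status.conditions field can give a similar explanation in one place. A sketch using kubectl's jsonpath output; the function name namespace_conditions is illustrative, and the output will be empty if the cluster does not report conditions.

```shell
# Hypothetical helper: print type, status and message for each
# condition recorded on a namespace, one per line, tab-separated.
namespace_conditions() {
  local ns="$1"
  kubectl get namespace "$ns" \
    -o jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.status}{"\t"}{.message}{"\n"}{end}'
}

# Usage:
#   namespace_conditions <namespace-name>
```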
This page was last reviewed on 8 April 2021. It needs to be reviewed again on 8 July 2021 by the page owner #cloud-platform.