
Delete Prometheus Metrics

This runbook describes the process for deleting metric series from Prometheus.

More information in this article.

NB: After this procedure was run, the deleted metrics still appeared in the dropdown list on this page, so it is not confirmed that the procedure actually works.

1. Identify the metrics you want to delete

  • Use your web browser’s developer tools to inspect the metrics dropdown list in the Prometheus UI, copy its HTML source, and save it as a file called metrics-select-box.html

  • Filter the list and convert it to plain text

cat metrics-select-box.html | sed $'s/></>\\\n </g' | grep jenkins_node_ | sed 's/<.option>//' | sed 's/.*>//' > metrics

This example captures every metric whose name contains the string jenkins_node_ and writes the names, one per line, to a file called metrics.
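The same filtering can be sketched in Ruby, which may be easier to adapt than the sed pipeline. This is a minimal sketch, not part of the original procedure: the helper name metric_names is hypothetical, and the regex assumes each metric appears in the saved file as an <option ...>name</option> element.

```ruby
# Extract metric names from the saved <select> markup and keep those that
# contain the given substring.
# Assumption: each metric is an <option ...>name</option> element.
def metric_names(html, substring)
  html.scan(%r{<option[^>]*>([^<]*)</option>})
      .flatten
      .map(&:strip)
      .select { |name| name.include?(substring) }
      .uniq
end

# Hypothetical usage, writing one name per line to the file "metrics":
#   names = metric_names(File.read("metrics-select-box.html"), "jenkins_node_")
#   File.write("metrics", names.join("\n") + "\n")
```

Matching on the option text rather than on whole lines means the result does not depend on how the browser laid out the copied markup.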

2. Enable the admin interface in Prometheus

kubectl -n monitoring edit prometheus prometheus-operator-prometheus

Under spec add:

enableAdminAPI: true

This will restart Prometheus, leading to a 3-5 minute outage.

3. Launch a port-forward pod

kubectl -n monitoring run port-forward-pod \
  --generator=run-pod/v1 \
  --image=ministryofjustice/port-forward \
  --port=9090 \
  --env="REMOTE_HOST=prometheus-operator-prometheus" \
  --env="LOCAL_PORT=9090" \
  --env="REMOTE_PORT=9090"

4. Forward local traffic to Prometheus

kubectl -n monitoring port-forward port-forward-pod 9090:9090

This port-forward will die periodically, so you’ll need to restart it when that happens.

5. Use curl (in another terminal) to hit the API endpoint

curl -v 'http://localhost:9090/api/v1/query?query=up'

If this works, you will see a list of metrics as a JSON document.
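If you would rather check the response programmatically than by eye, something like the following could work. It is a sketch that assumes the standard Prometheus API response envelope, where a successful call returns a JSON object whose "status" field is "success". The helper name api_success? is hypothetical.

```ruby
require "json"

# True when the body is a JSON object with "status" equal to "success".
# Anything that is not valid JSON (for example a connection error message)
# counts as a failure.
def api_success?(body)
  parsed = JSON.parse(body)
  parsed.is_a?(Hash) && parsed["status"] == "success"
rescue JSON::ParserError
  false
end

# Hypothetical usage against the port-forward from step 4:
#   body = `curl -s 'http://localhost:9090/api/v1/query?query=up'`
#   abort "Prometheus API not reachable" unless api_success?(body)
```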

You can now delete a single metric like this:

curl -v -X POST -g 'http://localhost:9090/api/v1/admin/tsdb/delete_series?match[]={__name__="jenkins_node_artefacts_javaskeleton_build_kaniko_1_rq4tp_0t7pb_1srvx_builds"}'
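The delete request is a POST to delete_series with a series selector in the match[] parameter. The sketch below builds that URL with the selector percent-encoded, which avoids relying on curl's -g flag. The helper names are hypothetical. The prefix selector is an alternative that this runbook has not tried: it relies on Prometheus regex matchers being fully anchored, so one request would cover every metric name starting with the prefix.

```ruby
require "uri"

# The port-forward address used in this runbook.
BASE_URL = "http://localhost:9090"

# Characters allowed in a Prometheus metric name.
METRIC_NAME = /\A[a-zA-Z_:][a-zA-Z0-9_:]*\z/

# Selector matching exactly one metric name.
def exact_selector(name)
  raise ArgumentError, "invalid metric name: #{name}" unless name.match?(METRIC_NAME)
  %({__name__="#{name}"})
end

# Selector matching every metric whose name starts with the prefix
# (untested alternative to deleting names one at a time).
def prefix_selector(prefix)
  raise ArgumentError, "invalid prefix: #{prefix}" unless prefix.match?(METRIC_NAME)
  %({__name__=~"#{prefix}.*"})
end

# Full delete_series URL with the selector percent-encoded.
def delete_series_url(selector)
  "#{BASE_URL}/api/v1/admin/tsdb/delete_series?" +
    URI.encode_www_form("match[]" => selector)
end

# Hypothetical usage:
#   system("curl", "-v", "-X", "POST", delete_series_url(exact_selector("some_metric")))
```

Passing the URL to curl as a separate argument, rather than interpolating it into a shell string, also sidesteps shell quoting problems.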

6. Use this script to delete multiple metrics

Use with care: this will delete every metric listed in the file metrics.

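#!/usr/bin/env ruby

# Delete every series for one metric name via the admin API.
def delete_metric(name)
  return if name.to_s.empty?
  execute %[curl -v -X POST -g 'http://localhost:9090/api/v1/admin/tsdb/delete_series?match[]={__name__="#{name}"}']
end

# Ask Prometheus to remove deleted data from disk and clear the tombstones.
def clean_tombstones
  execute %[curl -v -X POST -g 'http://localhost:9090/api/v1/admin/tsdb/clean_tombstones']
end

# Echo the command so that it appears in the log, then run it.
def execute(cmd)
  puts cmd
  `#{cmd}`
end

# One metric name per line, as produced in step 1.
metrics = File.readlines("metrics")

# Delete in batches of 1000, cleaning tombstones after each batch.
metrics.each_slice(1000) do |list|
  list.each { |metric| delete_metric(metric.chomp) }
  clean_tombstones
end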

This works in chunks of 1000 metrics, calling clean_tombstones after each chunk so that the deleted data is removed from disk.

Save the script as delete_metrics.rb, make it executable, and invoke it like this:

./delete_metrics.rb 2>&1 | tee delete.log

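This will fail after a while, because the port-forward will die. When it does, use delete.log to find the last metric that was deleted successfully (a 204 response), remove every line down to and including that metric from the metrics file, restart the port-forward, and run the script again. The following command shows the most recent successful deletions: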

grep -B 6 204 delete.log | grep POST | tail
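Trimming the metrics file by hand can also be sketched in Ruby. The helpers below are hypothetical and assume the curl -v log format that the grep above relies on: a "> POST ...delete_series?match[]={__name__="NAME"} ..." request line, followed later by a "< HTTP/1.1 204 ..." status line when the delete succeeded.

```ruby
# Request line that curl -v prints for a delete_series call (assumed format).
DELETE_REQUEST = /\A> POST .*delete_series\?match\[\]=\{__name__="([^"]+)"\}/
# Status line that curl -v prints for a 204 response (assumed format).
NO_CONTENT = %r{\A< HTTP/\S+ 204\b}

# Names of metrics whose delete request was followed by a 204 response.
def deleted_metrics(log_text)
  done = []
  pending = nil
  log_text.each_line do |line|
    if line.start_with?("> POST ")
      # Any other POST (such as clean_tombstones) clears the pending name,
      # so its 204 is not credited to an earlier failed delete.
      match = line.match(DELETE_REQUEST)
      pending = match && match[1]
    elsif pending && line.match?(NO_CONTENT)
      done << pending
      pending = nil
    end
  end
  done
end

# Metric names that still need deleting.
def remaining_metrics(all_names, log_text)
  all_names - deleted_metrics(log_text)
end

# Hypothetical usage, rewriting the metrics file in place:
#   names = File.readlines("metrics", chomp: true)
#   File.write("metrics", remaining_metrics(names, File.read("delete.log")).join("\n") + "\n")
```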

Repeat until finished.
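One way to check whether the metric names are really gone is to ask Prometheus for the values of the __name__ label (GET /api/v1/label/__name__/values) and look for any that still match. The sketch below only parses that response. The helper name remaining_names is hypothetical, and this runbook has not confirmed that the list is empty after a successful run.

```ruby
require "json"

# Given the JSON body returned by GET /api/v1/label/__name__/values,
# return the metric names that still contain the substring.
def remaining_names(body, substring)
  JSON.parse(body).fetch("data", []).select { |name| name.include?(substring) }
end

# Hypothetical usage against the port-forward from step 4:
#   body = `curl -s 'http://localhost:9090/api/v1/label/__name__/values'`
#   puts remaining_names(body, "jenkins_node_")
```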

7. Clean up

  • Disable the admin interface. Edit the Prometheus resource as in step 2 and remove the enableAdminAPI: true flag (this will restart Prometheus, causing an outage of a few minutes)

  • Remove the port-forward pod

kubectl -n monitoring delete pod port-forward-pod

This page was last reviewed on 24 May 2024. It needs to be reviewed again on 24 November 2024 by the page owner #cloud-platform.