Skip to main content

How to Investigate External-Dns Errors

When there are errors in external-dns logs, “ErrorsInExternalDNS” alert sent to the low priority slack channel.

Troubleshooting

If we see an ErrorsInExternalDNS alert in low-priority-alerts, this is usually due to a external-dns have an issue to write records to Route-53 for a particual hosted zones. You can see errors from the external-dns pod by running:

kubectl logs -n kube-system external-dns-<pod-id> -f

Invalid Change Batch

level=error msg="InvalidChangeBatch: [RDATA character limit of 32000 exceeded.]"

level=error msg="InvalidChangeBatch: [RRSet with DNS name _external_dns.development.moj.service.justice.gov.uk., type TXT cannot be created as other RRSets exist with the same name and type.,
RRSet with DNS name development.moj.service.justice.gov.uk., type A cannot be created as other RRSets exist with the same name and type.]"

level=error msg="InvalidChangeBatch: [Tried to delete resource record set [name='_external_dns.test.live.cloud-platform.service.justice.gov.uk.', type='TXT', set-identifier='test-live-1'] but it was not found,
Tried to delete resource record set [name='test.live.cloud-platform.service.justice.gov.uk.', type='A', set-identifier='test-live-1'] but it was not found]

level=error msg="InvalidChangeBatch: [Tried to create resource record set [name='cluster-cname-thisingress.example.', type='TXT'] but it already exists, Tried to create
resource record set [name='cluster-thisingress.example.', type='TXT'] but it already exists]

level=error msg="failed to submit all changes for the following zones: [/hostedzone/ABCDEFGHIJKLMN /hostedzone/NMLKJIHGFEDCBA]"

Identify the ingress causing the error in question and fix the ingress depending upon the error message. We may need to inform users to update their ingress causing the error.

Follow this troubleshoot guide for common error messages to determine the error’s cause and how to troubleshoot it.

Once the error is cleared, you will see info logs similar to:

level=info msg="Applying provider record filter for domains: [dev.example. .dev.example.]"
level=info msg="Desired change: CREATE cluster-a-new.dev.example TXT [Id: /hostedzone/ZZZ]"
level=info msg="All records are already up to date"

Rate Limited / Throttled

level=error msg="records retrieval failed: failed to list hosted zones: Throttling: Rate exceeded\n\tstatus code: 400, request id: 9df2a-7blah"
level=error msg="failed to list resource records sets for zone /hostedzone/BLAH_MOX: Throttling: Rate exceeded\n\tstatus code: 400, request id: 0-9216-435fblah"

There isn’t much we can do about being rate limited, acknowledge the alert.

This page was last reviewed on 18 April 2024. It needs to be reviewed again on 18 October 2024 by the page owner #cloud-platform .
This page was set to be reviewed before 18 October 2024 by the page owner #cloud-platform. This might mean the content is out of date.