[node-local-dns] Query loss

Open johnny550 opened this issue 1 year ago • 2 comments

Hi. I have a question regarding the node-local DNS cache. In my test, about 6 million queries are sent at around 67k QPS. As a failure test, I kill the node-local DNS pod on the same node. I end up with around 0.01~0.02% of the queries getting lost.
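
For a sense of scale, here is a rough back-of-the-envelope sketch in Python. It assumes the load was spread evenly over the run and that all losses fall in one contiguous window, neither of which is confirmed by the numbers above; the function names are illustrative only. Under those assumptions, 0.01~0.02% of 6 million is roughly 600 to 1,200 queries, which corresponds to about 9 to 18 ms of traffic at 67k QPS.

```python
# Figures taken from the report above.
TOTAL_QUERIES = 6_000_000
QPS = 67_000
LOSS_FRACTIONS = (0.0001, 0.0002)  # 0.01% and 0.02%


def lost_queries(total: int, fraction: float) -> float:
    """Number of queries lost for a given loss fraction."""
    return total * fraction


def outage_window_ms(lost: float, qps: float) -> float:
    """Length of a contiguous gap (in ms) that would drop `lost` queries.

    Assumption: queries arrive at a constant rate of `qps`.
    """
    return lost / qps * 1000.0


if __name__ == "__main__":
    for fraction in LOSS_FRACTIONS:
        lost = lost_queries(TOTAL_QUERIES, fraction)
        window = outage_window_ms(lost, QPS)
        print(f"{fraction:.2%} loss: ~{lost:.0f} queries, ~{window:.1f} ms window")
```

If the observed loss really is concentrated in a window this short, it would be consistent with a brief gap during pod shutdown rather than sustained dropping, but that is only an inference from the arithmetic.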

My assumption is that the queue of queries is not drained before the iptables rules are deleted and the node-local DNS pod is actually killed, which leads to some queries being dropped. Has anybody come across this issue, or does anyone know how to remediate it?

Should I expect this query loss? On one hand, I am inclined to, because a network hop is being removed mid-test. On the other hand, I do not expect node-local DNS to drop queries, since it only intercepts those directed at the CoreDNS pods. So removing the node-local DNS pods gracefully, as I am doing here, should perhaps not interfere with query handling?

Env:
- OS: Fedora CoreOS 38
- k8s: v1.28.2
- CoreDNS: v1.10.1
- local dns: 1.22.23

More about the setup_error metrics:

[Screenshot 2023-12-01 at 16 09 06]

I appreciate you taking the time to read this. Any input will be highly valued.

Thank you

johnny550, Dec 01 '23 07:12