unbound
[FR] configure auth server down detection
This FR is the result of a question I posted on the user ML: https://lists.nlnetlabs.nl/pipermail/unbound-users/2023-February/008025.html
Current behavior
Currently, when an authoritative server misbehaves and silently discards specific queries (in my case DS queries), unbound marks the server as not responding, even though other queries, e.g. for A or AAAA, would be answered.
At some point this leads to all queries for the affected zones being answered with SERVFAIL, because unbound tries every authoritative server for the DS query and marks all of them as down in the infra cache.
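For context, the existing infra cache options that come closest to this behavior are sketched below. This is only a minimal sketch: the values are arbitrary examples, and I am assuming (not claiming) that they shorten the outage. They do not make unbound ignore the dropped DS queries.

```
server:
    # Keep probing hosts that are marked as down (default: no).
    # Assumption: this lets a server be noticed as reachable again sooner.
    infra-keep-probing: yes

    # Lifetime of infra cache entries in seconds (default: 900).
    # Assumption: a lower value makes the "down" state expire sooner;
    # 60 is an arbitrary example value, not a recommendation.
    infra-host-ttl: 60
```

Neither option is per zone or per query type, which is why a more targeted setting is requested below.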
Describe the desired feature
I would love an option, set either per zone or per nameserver name, that tells unbound to ignore specific failing queries for nameservers I know are broken, e.g. to ignore failing DS queries for a given nameserver name, or to prevent these queries from being sent at all. This would be similar to an RPZ, but matching on NSDNAME and specific RR types.
Potential use-case
This feature would allow resolving names in zones hosted on broken nameservers where reaching out to the operators of these servers does not work or an intermediate solution is needed until contact can be established.
Explaining to customers why our resolvers return SERVFAIL while "everything works with google" is not always possible, and having this sort of workaround in the toolbelt can help in these situations.
From a more general view making the outage detection more robust could also help in situations with high(er) packet loss, avoiding false positives. Not sure, however, if this currently is an issue.
I don't like the message that supporting broken implementations sends, which is why I suggest making this a configuration option rather than default behavior. Still, I see it as in line with "be conservative in what you send, be liberal in what you accept".
It would however solve a very nasty problem for me right now.
There is a standards document about this problem: RFC 8906, "A Common Operational Problem in DNS Servers: Failure to Communicate", https://datatracker.ietf.org/doc/html/rfc8906
Realistically, the upstream server should respond, with an error, if it does not want to provide responses to that query type.
With local-zone: example.com typetransparent
and local-data: "www.example.com DS \# 00"
unbound stops clients from making the type DS query that you want stopped.
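Put together in unbound.conf, the two lines above might look like the sketch below. The zone and host names are the placeholders from this comment; the quoting and trailing dots follow the man page style and are my assumption about the intended form.

```
server:
    # typetransparent: only the name and type given in local-data are
    # answered locally; all other queries in the zone resolve normally.
    local-zone: "example.com." typetransparent

    # Local answer for DS queries on this one name, written in the
    # generic (RFC 3597) record format exactly as given above.
    local-data: "www.example.com. DS \# 00"
```

Each affected name needs its own local-data line, so this works best when only a few known names are involved.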
RPZ ignores records of type DS when they are present in the policy zone, so it cannot act on a query based on its type. It could, however, block the domain name entirely.