[Unbound 1.13.2] NXDOMAIN for queries to a stub-zone hosted in BIND

Open callum-key opened this issue 2 years ago • 5 comments

This is a follow up to ticket: https://github.com/NLnetLabs/unbound/issues/667

Describe the bug We have begun experiencing the same problem as mentioned in the above ticket: for a particular "stub-zone" domain ending in ".cso", we receive "NXDOMAIN" responses roughly 5-10 minutes after restarting the Unbound service.

This came back after about 6 months of working fine once we upgraded to 1.13.2, and seems to affect only one of our servers. We reconfigured that server from scratch (using the same IP address for the Unbound service) but still encounter the same problem.

To reproduce Unsure how to reproduce as it only affects one server, which is set up with the same configuration as our other two servers; sorry about this.

Expected behavior We expect that requests to this domain, hosted in BIND on the same server, would respond with the appropriate record value rather than "NXDOMAIN".

System:

  • Unbound version: 1.13.2
  • OS: Slackware 14.2
  • unbound -V output: Version 1.13.2

callum-key Sep 11 '23 14:09

Does the setting harden-below-nxdomain: no fix the issue? If so, there may be an issue with that option. Whether the domain is DNSSEC signed may also make a difference. Perhaps the issue can be fixed with aggressive-nsec: no? Perhaps the domain creates wrong NSEC records, or NSEC records from above the stub definition are wrongly used.
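For reference, the two settings suggested above could be tried along these lines. This is only a sketch: both option names are real unbound.conf options, but whether either one helps in this setup is exactly what is being tested, and it is probably more informative to change one at a time.

```
server:
    # Try this first. With "no", Unbound stops answering NXDOMAIN for
    # names below a name that it has cached as NXDOMAIN (RFC 8020).
    harden-below-nxdomain: no

    # Second experiment. With "no", Unbound does not synthesize
    # negative answers from cached NSEC records (RFC 8198).
    aggressive-nsec: no
```

After editing, unbound-checkconf can be used to validate the file before restarting the service.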

It would be useful to reproduce it somehow and capture a log, at verbosity 4, of what is going wrong, so that it can be fixed, perhaps by restricting NXDOMAIN or NSEC synthesis at some point.
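A logging setup for capturing that could look roughly like the fragment below. The option names are standard unbound.conf options; the log file path is a hypothetical example and has to be writable by the Unbound user (and reachable from inside the chroot, if one is used).

```
server:
    # Most detailed level that is normally needed for debugging resolution.
    verbosity: 4

    # Hypothetical path; adjust to the local layout.
    logfile: "/var/log/unbound.log"
    use-syslog: no

    # Human-readable timestamps make it easier to line up log entries
    # with the moment the NXDOMAIN answers start.
    log-time-ascii: yes
```

Verbosity 4 produces a lot of output on a busy resolver, so it is probably best enabled only for the window in which the problem is expected to appear.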

wcawijngaards Sep 11 '23 15:09

Hi,

Thank you for the very quick response! Please allow me to test these and see if I can get any information. It may be difficult to enable verbosity 4 logging as our servers are highly customised, however I'll see what information I can feed back.

callum-key Sep 11 '23 15:09

Hi there,

So I tested the harden-below-nxdomain: no option under the server: block, and for the last 5 hours things have been okay. I'm going to keep monitoring this using dig for the next day or so and I'll keep you updated with my findings.
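For anyone wanting to run the same kind of check, here is a minimal sketch of a monitoring loop around dig. It assumes dig is installed and that Unbound listens on 127.0.0.1; the function names and the polling interval are illustrative and not something used in this thread.

```python
import re
import subprocess
import time

# Matches the status field in dig's header line, for example:
# ";; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 4242"
STATUS_RE = re.compile(r"status:\s*([A-Z]+)")


def parse_status(dig_output):
    """Return the RCODE name from dig output, or None if no header is found."""
    match = STATUS_RE.search(dig_output)
    return match.group(1) if match else None


def query_status(name, server="127.0.0.1", timeout=5):
    """Run dig against the resolver and return the RCODE name (or None)."""
    result = subprocess.run(
        ["dig", f"@{server}", name, "A", "+tries=1", f"+time={timeout}"],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_status(result.stdout)


def monitor(name, interval=60, rounds=10):
    """Poll the name and print one timestamped line per query."""
    for _ in range(rounds):
        status = query_status(name)
        flag = "" if status == "NOERROR" else "  <-- unexpected"
        print(f"{time.strftime('%Y-%m-%d %H:%M:%S')} {name} {status}{flag}")
        time.sleep(interval)
```

Calling monitor() with a record name under the stub zone (for instance a hypothetical host.example.cso) prints one line per minute and marks any answer other than NOERROR, which gives a rough timestamp for when the NXDOMAIN answers begin.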

Once again, thank you for your prompt help!

callum-key Sep 13 '23 13:09

Hi,

Thank you for waiting for my feedback. Setting the harden-below-nxdomain: no option under the server: block seems to have done the trick, as the resolution of this stub domain has been working consistently since my last update.

Can you foresee any issues occurring with having this setting enabled long-term? Furthermore, would you like to see the Unbound configuration file(s) that we are using?

callum-key Sep 19 '23 07:09

For the long term, this solution should not really cause much trouble. It would be better to leave the option at its default, because the standards say it should be on, and because it stops certain query patterns by allowing them to be answered from cached responses. But resolution of queries is probably fine without it.

Although the setting fixes the problem, it does not reveal what the problem was. Some NXDOMAIN content appeared and was used to create NXDOMAIN answers. With the new setting that no longer happens, but it is not clear whether the problem was due to the content of the servers or due to a software issue in Unbound. It is nice that there is a fix that can be used, and I guess that concludes the issue here. If more information becomes available, specifically about what goes wrong, perhaps there can be more fixes.

wcawijngaards Sep 19 '23 08:09