unbound icon indicating copy to clipboard operation
unbound copied to clipboard

add keep-cache option to "unbound-control reload" to keep caches

Open jinmeiib opened this issue 3 years ago • 3 comments

This PR implements a new option +keep-cache to unbount-control. It allows unbound to keep the RRset and message caches on reloading the configuration.

The main idea is to move the ownership of the per-worker alloc_caches from workers to "daemon" and keep using them when it's deemed to be feasible. In my understanding this would (almost) eliminate the first reason for clearing the caches as commented in daemon.c:daemon_cleanup:

	/* clean up caches because
	 * a) RRset IDs will be recycled after a reload, causing collisions
	 * b) validation config can change, thus rrset, msg, keycache clear */

The second reason is still an issue. And, there are other config changes that would make the reuse of caches impossible or infeasible:

  • changing the number of threads
  • changing size of rrset or message cache (then the corresponding cache is re-created and old cache entries are invalidated anyway)
  • changing the forwarder address
  • probably more

This PR handles the first two cases explicitly (see the change and comment in daemon_apply_cfg), but leaves the consideration on other cases as the operator's responsibility. That's why keeping the cache is optional and non-default.

I have an automated test that confirms the change, but its style is different from other unbound tests, so it's not included in this PR. I'm also aware that some documentation update will be necessary if (the idea of) this PR is adopted, but I'd first like to get feedback on the concept and the current code that implements it.

I ran both make test and make longtest to confirm the change of this PR won't introduce regression. In the latter I saw occaisional failure of tcp_req_order and ssl_req_order, but they also fail in the master branch, so I don't think the failure is due to this PR.

I'm looking forward to any feedback.

jinmeiib avatar Nov 11 '21 18:11 jinmeiib

@wcawijngaards could you take care?

UladzimirTrehubenka avatar Jun 08 '22 09:06 UladzimirTrehubenka

This is in my todo list but because of more pressing stuff it got jostled away. I will have a look this/next week.

gthess avatar Jun 08 '22 10:06 gthess

I stumbled on this PR by accident. But I very much support this option. In my setup I tried to solve this by using systemd jobs tied to unbound.service that do dump_cache, load_cache respectively. I'd be willing to help with testing and/or documentation, if needed.

penguinpee avatar Jun 27 '22 12:06 penguinpee

Almost year for code review =(

UladzimirTrehubenka avatar Oct 06 '22 07:10 UladzimirTrehubenka

I would like to have included this on the 1.17.0 release, but time was an issue :( The PR for me is ready except for renaming the command from reload +keep-cache to reload_keep_cache, and introducing some test cases. @jinmeiib, how much do you want to be involved with these? I could take over but I would appreciate if you could share your current test cases and I could try incorporating them in Unbound's testing directories.

gthess avatar Oct 14 '22 19:10 gthess

@gthess I'm attaching the example test tpkg to this comment. infoblox_unbound_reload.tar.gz

jinmeiib avatar Nov 28 '22 14:11 jinmeiib

Perfect, thanks!

gthess avatar Nov 28 '22 14:11 gthess