redpanda
Refresh health monitor on leader
Cover letter
Changed how cluster health is refreshed on the raft0 `leader` node. A refresh is now triggered whenever the controller leader's health metadata is stale. Previously the refresh was timer-based, so after a leadership change a requester could read stale health metadata that was much older than `max_metadata_age`.
Now, every time the health report is requested, we check the metadata age. If the metadata is stale, we dispatch the appropriate refresh request: a follower contacts the leader, while the leader gathers information from the cluster.
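The check-on-request flow described above could be sketched roughly as follows. This is an illustrative model only, not the code in this PR: `HealthMonitor`, `HealthReport`, `get_health_report`, and the injected callbacks are hypothetical names, and only `max_metadata_age` comes from the description.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class HealthReport:
    collected_at: float                      # clock value when the report was built
    nodes: dict = field(default_factory=dict)


class HealthMonitor:
    """Hypothetical model: refresh on request when the cached report is stale."""

    def __init__(
        self,
        max_metadata_age: float,
        is_leader: Callable[[], bool],
        collect_from_cluster: Callable[[], dict],
        fetch_from_leader: Callable[[], dict],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max_age = max_metadata_age
        self._is_leader = is_leader
        self._collect_from_cluster = collect_from_cluster
        self._fetch_from_leader = fetch_from_leader
        self._clock = clock
        self._report: Optional[HealthReport] = None

    def get_health_report(self) -> HealthReport:
        # The age is checked on every request instead of relying on a timer,
        # so a stale report left over from before a leadership change is
        # never returned.
        now = self._clock()
        if self._report is None or now - self._report.collected_at > self._max_age:
            self._report = self._refresh()
        return self._report

    def _refresh(self) -> HealthReport:
        # The leader gathers reports from the cluster; any other node asks
        # the leader for its report.
        if self._is_leader():
            nodes = self._collect_from_cluster()
        else:
            nodes = self._fetch_from_leader()
        return HealthReport(collected_at=self._clock(), nodes=nodes)
```

In this sketch the cost of a refresh is paid by whichever request first observes stale metadata; requests that arrive while the report is still fresh return the cached copy.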
Fixes #ISSUE-NUMBER, Fixes #ISSUE-NUMBER, ...
Backport Required
- [ ] not a bug fix
- [ ] papercut/not impactful enough to backport
- [ ] v22.2.x
- [ ] v22.1.x
- [ ] v21.11.x
UX changes
Describe in plain language how this PR affects an end-user. What topic flags, configuration flags, command line flags, deprecation policies etc are added/changed.
Release notes
It seems like the timer could have been left in place to fetch metadata optimistically, rather than inlining the cost of refreshing metadata at request time. But maybe it's not a big deal either way?
I thought the same, but it looks like the metadata dissemination service (among others) checks the health report every 3 seconds, so the effect is almost the same as a dedicated timer :)
/backport v22.2.x
Failed to run cherry-pick command. I executed the below command:
git cherry-pick -x 1df9582e3326b2c537fd4b2e5cdebdc3344816f5 1a7b82901364bd4831fe6499cc3ffaff0c70637c 5a4741ab648a02ffcaf54ab8534a2daf60b2525c 4d94f1ff4a0a5553c4dd4937c0128e165a46ae4e 20ed06bd0b7ddbedc66e8ee1256814a4108bca6d c6e97b35607d5615d1bc47a751c95706c1d34d76 4257f3672be8c8abf70aa0510b991ea5db1f1faf
Please don't leave the release notes empty.