
Refresh health monitor on leader

Open mmaslankaprv opened this issue 2 years ago • 0 comments

Cover letter

Changed how cluster health is refreshed on the raft0 leader node. The refresh is now triggered when the controller leader's health metadata is stale. Previously the refresh was timer-based, so after a leadership change a requester could access stale health metadata that was much older than max_metadata_age.

Now, every time the health report is requested, we check the metadata age. If the metadata is stale, we dispatch the appropriate refresh request, either contacting the leader or gathering information from the cluster.
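
For readers unfamiliar with the pattern, here is a minimal sketch of a check-on-request refresh. It is illustrative Python, not the Redpanda implementation: the `HealthMonitor` class, its callbacks (`is_leader`, `collect_from_cluster`, `fetch_from_leader`), and the injectable `clock` are hypothetical names; only `max_metadata_age` comes from the description above.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class HealthMonitor:
    """Hypothetical model of an on-demand health report cache."""

    is_leader: Callable[[], bool]             # assumed: True on the raft0 leader
    collect_from_cluster: Callable[[], dict]  # assumed: leader gathers node reports
    fetch_from_leader: Callable[[], dict]     # assumed: follower asks the leader
    max_metadata_age: float = 10.0            # seconds; default value is an assumption
    clock: Callable[[], float] = time.monotonic
    _report: Optional[dict] = None
    _last_refresh: Optional[float] = None

    def _is_stale(self) -> bool:
        # A report that was never fetched counts as stale.
        if self._last_refresh is None:
            return True
        return self.clock() - self._last_refresh > self.max_metadata_age

    def get_health_report(self) -> dict:
        # The age check happens on every request, so a node that has just
        # become leader cannot serve a report older than max_metadata_age.
        if self._is_stale():
            if self.is_leader():
                self._report = self.collect_from_cluster()
            else:
                self._report = self.fetch_from_leader()
            self._last_refresh = self.clock()
        return self._report
```

With a timer-only design, the age of the cached report depends on when the timer last fired on that particular node; here the age is bounded at the moment of every read, at the cost of occasionally paying for the refresh inline with the request.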

Fixes #ISSUE-NUMBER, Fixes #ISSUE-NUMBER, ...

Backport Required

  • [ ] not a bug fix
  • [ ] papercut/not impactful enough to backport
  • [ ] v22.2.x
  • [ ] v22.1.x
  • [ ] v21.11.x

UX changes

Describe in plain language how this PR affects an end-user. What topic flags, configuration flags, command line flags, deprecation policies etc are added/changed.

Release notes

mmaslankaprv avatar Aug 09 '22 16:08 mmaslankaprv

It seems like the timer could have been left in place to fetch metadata optimistically, rather than in-lining the cost of refreshing metadata at the time of the request. But maybe it's not a big deal either way?

I thought the same, but it looks like the metadata dissemination service (among others) will check the health report every 3 seconds, so the effect is almost the same as a dedicated timer :)

ztlpn avatar Aug 10 '22 12:08 ztlpn

/backport v22.2.x

mmaslankaprv avatar Aug 11 '22 05:08 mmaslankaprv

Failed to run cherry-pick command. I executed the below command:

git cherry-pick -x 1df9582e3326b2c537fd4b2e5cdebdc3344816f5 1a7b82901364bd4831fe6499cc3ffaff0c70637c 5a4741ab648a02ffcaf54ab8534a2daf60b2525c 4d94f1ff4a0a5553c4dd4937c0128e165a46ae4e 20ed06bd0b7ddbedc66e8ee1256814a4108bca6d c6e97b35607d5615d1bc47a751c95706c1d34d76 4257f3672be8c8abf70aa0510b991ea5db1f1faf

Workflow run logs.

vbotbuildovich avatar Aug 11 '22 05:08 vbotbuildovich

Please don't leave the release notes empty.

BenPope avatar Aug 11 '22 15:08 BenPope