helm-charts [tempo-distributed] Allow higher count of unavailable ingester replicas in pdb

Open AlexDCraig opened this issue 2 years ago • 4 comments

Resolves https://github.com/grafana/helm-charts/issues/1653

Happy to make this more conservative; I'm just interested in raising the count above 1.

Aug 02 '22 17:08 AlexDCraig

All committers have signed the CLA.

Aug 02 '22 17:08 CLAassistant

With RF3 the system can tolerate 1 ingester being down and still accept writes and return reads. Depending on how you do your rollouts, you could take down more than one at once, but that could increase load on the remaining ingesters quite a bit.

Aug 02 '22 18:08 joe-elliott

@joe-elliott I have a lot more than 3 ingester replicas in my ring. My understanding is therefore that I can afford to lose more than one and the ring can still service reads and writes.

Aug 02 '22 18:08 AlexDCraig

It's occurred to me now that my default setting here isn't accurate for the default values of RF and replicas :D

What I'm suggesting is that maxUnavailable should be replicas - (floor(replication factor / 2) + 1). With the default values this gives maxUnavailable: 1, and it also covers my situation (and probably that of others like me): with more than 3 ingester replicas, we can afford to lose more than one at a time.

Aug 02 '22 18:08 AlexDCraig
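To make the arithmetic in the proposal above concrete, here is a minimal Python sketch of the suggested formula. The function name `max_unavailable` and the clamp at zero are assumptions added for illustration. They are not part of the chart or of the PR, and the sketch does not assess whether the resulting value is safe for a given rollout.

```python
def max_unavailable(replicas: int, replication_factor: int) -> int:
    """Proposed PDB value: replicas - (floor(RF / 2) + 1).

    Hypothetical helper for illustration only; the clamp at 0 is an
    assumption so that small rings never yield a negative value.
    """
    quorum = replication_factor // 2 + 1  # floor(RF / 2) + 1
    return max(replicas - quorum, 0)


# Default case from the thread: 3 replicas with RF3.
print(max_unavailable(3, 3))   # 1

# A larger ring with the same RF yields a much higher value.
print(max_unavailable(10, 3))  # 8
```

For the default of 3 replicas and RF3 this reproduces maxUnavailable: 1. For larger rings the value grows quickly, which is where the earlier caveat about extra load on the remaining ingesters applies.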
