helm-charts
[tempo-distributed] Allow higher count of unavailable ingester replicas in pdb
Resolves https://github.com/grafana/helm-charts/issues/1653
Happy to make this more conservative, just interested in increasing the count from 1
With RF3 the system can tolerate 1 ingester being down and still accept writes and serve reads. Depending on how you do your rollouts, you could take down more than one at once, but that could increase load on the remaining ingesters quite a bit.
@joe-elliott I have a lot more than 3 ingester replicas in my ring. My understanding is that I can therefore afford to lose more than one and the ring can still service reads and writes.
It's occurred to me now that my default setting here isn't accurate for the default values of RF and replicas :D
What I'm suggesting is that maxUnavailable should be `replicas - (floor(replication factor / 2) + 1)`. This covers the default case by making it maxUnavailable: 1, but it also covers my situation (and probably that of others like me) where there are more than 3 ingester replicas and losing more than one at a time is affordable.
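To make the arithmetic concrete, here is a minimal sketch of the proposed formula in Python. The function name `max_unavailable` and the clamp to zero are assumptions added for illustration; they are not part of the chart. The sketch only evaluates the formula as written and does not show that the resulting value is safe for a given ring.

```python
def max_unavailable(replicas: int, replication_factor: int) -> int:
    """Evaluate replicas - (floor(replication_factor / 2) + 1).

    Hypothetical helper for illustration only, not part of the chart.
    """
    # Quorum term from the formula: floor(RF / 2) + 1.
    quorum = replication_factor // 2 + 1
    # Clamp at 0 (an assumption) so that fewer replicas than the quorum
    # never yields a negative value.
    return max(replicas - quorum, 0)


if __name__ == "__main__":
    # Default case discussed above: 3 replicas with RF3.
    print(max_unavailable(3, 3))
    # A larger ring, e.g. 6 replicas with RF3.
    print(max_unavailable(6, 3))
```

With 3 replicas and RF3 this evaluates to 1, matching the default case described above. With 6 replicas and RF3 it evaluates to 4, which is where the earlier caveat about extra load on the remaining ingesters applies.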