Reduce default distributor-to-ingester push timeout to 2s

Open 56quarters opened this issue 3 years ago • 3 comments
Signed-off-by: Nick Pillitteri [email protected]

What this PR does

Reduce the default timeout from 20s to 2s. Some resources for the request to each ingester are held in distributor memory until the ingester responds or the request times out, so a shorter timeout reduces distributor memory usage during ingester crashes.

Which issue(s) this PR fixes or relates to

Fixes #2727

Checklist

  • [ ] Tests updated
  • [X] Documentation added
  • [X] CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

56quarters · Aug 15 '22 20:08

I think it would make sense to also update the default of -distributor.forwarding.request-timeout to the same value:

https://github.com/grafana/mimir/blob/62e2f933b8d07c0dbd668634d307465cafa2b311/pkg/distributor/forwarding/config.go#L21

Otherwise, requests being forwarded to a slow forwarding target could be held in memory for longer than the value of -distributor.remote-timeout.

replay · Aug 15 '22 21:08

I think we should merge this change only after it has been dogfooded at Grafana Labs (aka: running in all prod envs). We usually change defaults only after that.

pracucci · Aug 16 '22 06:08

> I think we should merge this change only after it has been dogfooded at Grafana Labs (aka: running in all prod envs). We usually change defaults only after that.

Sounds good. I'll test in dev/ops and add something to the next release jsonnet so that the change can go out a week after the new metrics for chunk-deduplication #2713.

56quarters · Aug 16 '22 13:08

Thanks for doing this in Mimir so we don't have to deal with keeping jsonnet + helm in sync!

krajorama · Aug 30 '22 11:08