Helm <> Jsonnet: Fix read path differences
What this PR does
Draft PR to trigger the CI Helm <> Jsonnet diff
Which issue(s) this PR fixes or relates to
Fixes #2205
Checklist
- [ ] Tests updated
- [ ] Documentation added
- [ ] `CHANGELOG.md` updated - the order of entries should be `[CHANGE]`, `[FEATURE]`, `[ENHANCEMENT]`, `[BUGFIX]`
@dimitarvdimitrov: One of the differences is `querier.max-concurrent`, set to 16 in Helm vs 20 (the default) in Jsonnet. I see you added this in #2655 with the comment:

> With query sharding we run more but smaller queries. We must strike a balance which allows us to process more sharded queries in parallel when requested, but not overload queriers during non-sharded queries.

Is there any more context on why 20 isn't a good value here?
@pstibrany One difference is `store-gateway.sharding-ring.wait-stability-min-duration`. It defaults to `0`, but is set to `1m` by jsonnet. I can see that `1m` used to be the default until your PR #976 moved it from a default in Go to a default in Jsonnet. Can you explain why? I'd like to either:

- Revert it to being a default in the Go code and remove it from Jsonnet
- Add the `1m` setting to Helm if it's generally useful (roughly as sketched below)
- Continue ignoring it if somehow Jsonnet is special, although this seems really unlikely
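For the second option, a minimal sketch of what porting the `1m` value into the Helm chart could look like, assuming the `mimir.structuredConfig` hook of the mimir-distributed chart and the usual flag-to-YAML key mapping (not a final proposal):

```yaml
# values.yaml (sketch)
mimir:
  structuredConfig:
    store_gateway:
      sharding_ring:
        # Mirrors the Jsonnet default of
        # -store-gateway.sharding-ring.wait-stability-min-duration=1m
        wait_stability_min_duration: 1m
```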
@pracucci One difference is `blocks-storage.bucket-store.max-chunk-pool-bytes`. It defaults to 2GB in Helm and is overridden to 12GB in Jsonnet after your PR here: https://github.com/grafana/cortex-jsonnet/pull/322/files

Any reason I shouldn't port this into Mimir as a default in the Go code?
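If it shouldn't become a Go default, the Jsonnet value could instead be mirrored in Helm, roughly like this (a sketch; the YAML key path is assumed from the flag name):

```yaml
mimir:
  structuredConfig:
    blocks_storage:
      bucket_store:
        # 12 GiB, matching the Jsonnet override; the shipped default stays at 2 GiB
        max_chunk_pool_bytes: 12884901888
```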
`blocks-storage.bucket-store.index-cache.memcached.max-item-size` is 5MB in Jsonnet, 15MB in Helm. @ortuman, you most recently edited the Helm side in https://github.com/grafana/mimir/issues/2064, any idea why it's different or which one should be preferred?
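For reference, the two candidate values expressed in config form (key path assumed from the flag name; 5 MiB = 5242880 bytes, 15 MiB = 15728640 bytes):

```yaml
blocks_storage:
  bucket_store:
    index_cache:
      memcached:
        # Jsonnet uses 5 MiB; Helm currently uses 15 MiB (15728640)
        max_item_size: 5242880
```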
> @dimitarvdimitrov: One of the differences is `querier.max-concurrent`, set to 16 in Helm vs 20 (the default) in Jsonnet. I see you added this in #2655 with the comment:
>
> > With query sharding we run more but smaller queries. We must strike a balance which allows us to process more sharded queries in parallel when requested, but not overload queriers during non-sharded queries.
>
> Is there any more context on why 20 isn't a good value here?
Helm uses 16 when query sharding is enabled. Jsonnet uses the same value when query sharding is enabled. I think the difference is that jsonnet doesn't enable query sharding by default.
https://github.com/grafana/mimir/blob/74b03a6887a2354da699aef3a2353cd04de35b8b/operations/mimir/query-sharding.libsonnet#L67-L72
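So both sides pair 16 with query sharding; the Helm side of that pairing looks roughly like this (a sketch assuming the `mimir.structuredConfig` layout, not copied from the chart):

```yaml
mimir:
  structuredConfig:
    frontend:
      # Helm enables query sharding by default...
      parallelize_shardable_queries: true
    querier:
      # ...and lowers per-querier concurrency because sharding produces
      # more, smaller queries.
      max_concurrent: 16
```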
@dimitarvdimitrov Thanks, I enabled it in the test now. More questions:

`query-scheduler.max-outstanding-requests-per-tenant` is set to `1600` in Helm, but `800` in Jsonnet when query sharding is enabled. Any idea which is more appropriate?

`querier.max-query-parallelism` is set to 224 in Helm, but 240 in Jsonnet (14d vs 15d queries are targeted). Any reason Helm shouldn't use a 15d calculation and match?
> @dimitarvdimitrov Thanks, I enabled it in the test now. More questions:
>
> `query-scheduler.max-outstanding-requests-per-tenant` is set to `1600` in Helm, but `800` in Jsonnet when query sharding is enabled. Any idea which is more appropriate?

I think I just didn't check the jsonnet for this one.

> `querier.max-query-parallelism` is set to 224 in Helm, but 240 in Jsonnet (14d vs 15d queries are targeted). Any reason Helm shouldn't use a 15d calculation and match?

I thought that 15d was not a time range that people usually run with, since Grafana offers 7d, 14d, 30d... ranges. But maybe I had overlooked some reasons for it being 15d in jsonnet.
I've opened a PR against this one to correct these numbers and set them to what jsonnet uses: https://github.com/grafana/mimir/pull/3013. Thanks for bringing it up.
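For reference, the corrected values line up with Jsonnet's if `max-query-parallelism` is derived as a per-day factor of 16 times the targeted query range (my reading of the 14d vs 15d remark): 16 × 14 = 224, 16 × 15 = 240. In `structuredConfig` form the aligned values would look roughly like this (key paths assumed; `max_query_parallelism` is a per-tenant limit):

```yaml
mimir:
  structuredConfig:
    query_scheduler:
      # Match Jsonnet (Helm previously used 1600)
      max_outstanding_requests_per_tenant: 800
    limits:
      # 16 per day x 15 days = 240 (Helm previously used 16 x 14 = 224)
      max_query_parallelism: 240
```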
The CHANGELOG has just been cut to prepare for the next Mimir release. Please rebase `main` and eventually move the CHANGELOG entry added / updated in this PR to the top of the CHANGELOG document. Thanks!