Nick Pillitteri
Store-gateways don't _load_ the TSDB index-header into memory until needed when lazy loading is enabled, but they still must download the index-header (a subset of the TSDB index) to local...
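To make the download-versus-load distinction concrete, here is a minimal Go sketch. It is not Mimir's actual code: the `lazyHeader` type, the `loadFn` callback, and the file path are all hypothetical names made up for illustration. It assumes the file is already on local disk and only defers reading it into memory until the first query touches it.

```go
package main

import (
	"fmt"
	"sync"
)

// lazyHeader is a hypothetical stand-in for an index-header that is already
// on local disk but is only read into memory on first use.
type lazyHeader struct {
	path string // local file path; assumed to have been downloaded already

	once   sync.Once
	data   []byte
	err    error
	loadFn func(path string) ([]byte, error) // hypothetical loader, e.g. os.ReadFile
	loads  int                               // how many times loadFn actually ran
}

// newLazyHeader records where the file lives; it does not read it.
func newLazyHeader(path string, loadFn func(string) ([]byte, error)) *lazyHeader {
	return &lazyHeader{path: path, loadFn: loadFn}
}

// get loads the header on the first call and reuses the result afterwards.
func (h *lazyHeader) get() ([]byte, error) {
	h.once.Do(func() {
		h.loads++
		h.data, h.err = h.loadFn(h.path)
	})
	return h.data, h.err
}

func main() {
	// fakeLoad replaces real disk I/O so the sketch runs anywhere.
	fakeLoad := func(path string) ([]byte, error) {
		return []byte("header for " + path), nil
	}

	h := newLazyHeader("/data/tsdb/block-1/index-header", fakeLoad)
	fmt.Println("loads before first query:", h.loads)

	_, _ = h.get() // first query triggers the load
	_, _ = h.get() // later queries reuse the in-memory copy
	fmt.Println("loads after two queries:", h.loads)
}
```

The point of the sketch is only that constructing the object costs no memory for the header contents, while the file itself still has to exist locally beforehand.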
There are different things going on here. "Loading a block" involves several different pieces of work happening. The index-header is a part of that. Lazy loading controls whether the index-header...
> Compactor starts uploading blocks after compaction and creates bucket-index.json.gz but is incomplete due s3-multipart upload, at the same time, store-gateway reads the incomplete block and tries to unzip it...
One example of the kind of query we've seen exhaust store-gateway memory is a label values request with very broad matchers: `{service_name="a", span_kind="b", resource_name=~".*$res.*", __name__="duration_seconds_bucket"}`. We don't stream label...
> is it the actual strings for the response that take up memory?

I'm not sure, the traces and profiles from the originating incident are gone 😢. I'm going to...
cc @grafana/mimir-ruler-and-alertmanager-maintainers
Mimir Alertmanager doesn't allow any configuration that reads files since this is a security hole (the Alertmanager shouldn't be reading arbitrary files on disk) and doesn't make sense in a...
Another instance: https://github.com/grafana/mimir/actions/runs/11040649844/job/30669127966?pr=9413

```
--- FAIL: TestSchedulerProcessor_processQueriesOnSingleStream (1.56s)
    --- FAIL: TestSchedulerProcessor_processQueriesOnSingleStream/should_not_cancel_query_execution_if_scheduler_client_returns_a_non-cancellation_error (0.51s)
        scheduler_processor_test.go:255:
            Error Trace: /__w/mimir/mimir/pkg/querier/worker/scheduler_processor_test.go:255
            Error:       Not equal:
                         expected: 2
                         actual  : 1
            Test:        TestSchedulerProcessor_processQueriesOnSingleStream/should_not_cancel_query_execution_if_scheduler_client_returns_a_non-cancellation_error
            Messages:    Expected number of...
```
Mimir "zones" don't need to correspond to cloud provider availability zones. They can be used when running in a single cloud provider AZ which is indeed how we ran Mimir...
This is a very good point but I worry about the disruption that changing this default would cause. I'm not sure how to mitigate that besides documentation and release notes...