mimir
Incorrect value of estimated compaction jobs when out-of-order blocks have special label.
Describe the bug
The recently introduced metric `cortex_bucket_index_estimated_compaction_jobs` reports the wrong number of compaction jobs when the `-ingester.out-of-order-blocks-external-label-enabled` option (or its per-tenant version) is used. The reason is that we don't preserve all block labels in the bucket index, so bucket-index job estimation can't use the same labels as the compactor does.
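To illustrate the mismatch, here is a minimal sketch, not Mimir's actual planner. It assumes that blocks are grouped purely by their label sets and ignores time ranges and split/merge stages. The `block` type, `groupKey`, and `countGroups` are hypothetical names, and the label value `"true"` is an assumption.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// block is a hypothetical, simplified stand-in for block metadata.
type block struct {
	ID     string
	Labels map[string]string // external labels of the block
}

// groupKey builds a deterministic key from the labels accepted by keep.
func groupKey(labels map[string]string, keep func(name string) bool) string {
	names := make([]string, 0, len(labels))
	for name := range labels {
		if keep(name) {
			names = append(names, name)
		}
	}
	sort.Strings(names)
	var sb strings.Builder
	for _, name := range names {
		sb.WriteString(name + "=" + labels[name] + ";")
	}
	return sb.String()
}

// countGroups returns how many distinct label groups the blocks fall into.
func countGroups(blocks []block, keep func(name string) bool) int {
	seen := map[string]struct{}{}
	for _, b := range blocks {
		seen[groupKey(b.Labels, keep)] = struct{}{}
	}
	return len(seen)
}

// allLabels models the compactor, which sees every label of a block.
func allLabels(string) bool { return true }

// shardOnly models the bucket index, which only keeps the shard ID.
func shardOnly(name string) bool { return name == "__compactor_shard_id__" }

func main() {
	blocks := []block{
		{ID: "A", Labels: map[string]string{}},
		{ID: "B", Labels: map[string]string{"__out_of_order__": "true"}}, // value assumed
	}
	fmt.Println("groups seen by compactor:", countGroups(blocks, allLabels))
	fmt.Println("groups seen via bucket index:", countGroups(blocks, shardOnly))
}
```

In this toy model the two views disagree on the number of groups, which is the kind of discrepancy the metric would then report.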
Possible solutions:
- Include all block labels in the bucket index.
- Use the `out_of_order` flag in the bucket index together with the per-tenant value of `out-of-order-blocks-external-label-enabled` to deduce the correct labels. This would only work for 1st-level blocks that still have the `out_of_order` flag; the flag is lost after compaction.
- Ignore the problem, as the metric is just an estimate anyway.
I suggest we implement solution 1.
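For comparison, the second option could look roughly like the sketch below. `indexBlock` and `deducedLabels` are hypothetical and only mirror the two pieces of information mentioned above. The label value `"true"` is an assumption.

```go
package main

import "fmt"

// indexBlock is a hypothetical view of what the bucket index knows about a block.
type indexBlock struct {
	CompactorShardID string
	OutOfOrder       bool
}

// deducedLabels reconstructs the labels the compactor would see, based on the
// bucket index entry and the tenant's out-of-order external label setting.
func deducedLabels(b indexBlock, oooLabelEnabled bool) map[string]string {
	labels := map[string]string{}
	if b.CompactorShardID != "" {
		labels["__compactor_shard_id__"] = b.CompactorShardID
	}
	if b.OutOfOrder && oooLabelEnabled {
		labels["__out_of_order__"] = "true" // value assumed
	}
	return labels
}

func main() {
	// A 1st-level block still carries the flag, so the label can be deduced.
	fresh := indexBlock{OutOfOrder: true}
	fmt.Println(deducedLabels(fresh, true))

	// A compacted block has lost the flag, so the deduction misses the label
	// even if the block itself still carries it.
	compacted := indexBlock{OutOfOrder: false}
	fmt.Println(deducedLabels(compacted, true))
}
```

The second call shows the limitation: once the flag is gone, the deduced labels no longer match what the compactor sees.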
> we don't preserve all block labels in the bucket index
What labels are absent from the bucket index? Is it labels from partial blocks?
> What labels are absent from the bucket index? Is it labels from partial blocks?
Well, the bucket index doesn't preserve any block labels, although we do set the `CompactorShardID` field from the `__compactor_shard_id__` label.
https://github.com/grafana/mimir/blob/fcd9f0300f8cbce54fb9394b1649964c7c634acf/pkg/storage/tsdb/bucketindex/index.go#L91-L92
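As a rough sketch of what preserving labels could look like (the first proposed solution), the entry below carries a copy of all block labels next to the shard ID. This is not the real bucket index type: `indexEntry`, `newIndexEntry`, the `Labels` field, and the JSON field names are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// indexEntry is a hypothetical bucket index entry that keeps all block labels.
type indexEntry struct {
	ID               string            `json:"block_id"`
	CompactorShardID string            `json:"compactor_shard_id,omitempty"`
	Labels           map[string]string `json:"labels,omitempty"` // hypothetical field
}

// newIndexEntry copies the block's labels into the entry instead of
// extracting only the compactor shard ID.
func newIndexEntry(id string, metaLabels map[string]string) indexEntry {
	e := indexEntry{ID: id, CompactorShardID: metaLabels["__compactor_shard_id__"]}
	if len(metaLabels) > 0 {
		e.Labels = make(map[string]string, len(metaLabels))
		for name, value := range metaLabels {
			e.Labels[name] = value
		}
	}
	return e
}

func main() {
	e := newIndexEntry("01EXAMPLE", map[string]string{
		"__compactor_shard_id__": "1_of_4",
		"__out_of_order__":       "true", // value assumed
	})
	out, err := json.Marshal(e)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

With `omitempty`, blocks without labels would serialize the same way as before, so the extra field only costs space for blocks that actually have labels.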
In theory, blocks can have any number of other labels, and the compactor would only compact blocks with equal labels together. In practice, Mimir only uses `__compactor_shard_id__` and `__out_of_order__`; all other labels are deprecated.
https://github.com/grafana/mimir/blob/a42733ca55b5265941bf85c817fa2cb36d504542/pkg/storage/tsdb/config.go#L48-L50