Incorrect value of estimated compaction jobs when out-of-order blocks have special label.

Open pstibrany opened this issue 1 year ago • 2 comments

Describe the bug

The recently introduced metric cortex_bucket_index_estimated_compaction_jobs reports the wrong number of compaction jobs when the -ingester.out-of-order-blocks-external-label-enabled option (or its per-tenant version) is used. The reason is that we don't preserve all block labels in the bucket index, so the bucket-index job estimation can't use the same labels as the compactor does.

Possible solutions:

  1. Include all block labels in the bucket index.
  2. Use the out_of_order flag in the bucket index together with the per-tenant value of out-of-order-blocks-external-label-enabled to deduce the correct labels. This would only work for 1st-level blocks that still have the out_of_order flag, because the flag is lost after compaction.
  3. Ignore the problem, as the metric is just an estimate anyway.
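To make the limitation of solution 2 concrete, here is a minimal sketch of how labels could be deduced from the bucket index. The indexBlock type and the deduceLabels function are hypothetical and exist only for illustration; the label value "true" for __out_of_order__ is also an assumption, not taken from this issue.

```go
package main

import "fmt"

// indexBlock is a hypothetical, simplified bucket-index entry.
// It is not the real bucketindex.Block type.
type indexBlock struct {
	CompactorShardID string // taken from the __compactor_shard_id__ label
	OutOfOrder       bool   // only set on 1st-level blocks; lost after compaction
}

// deduceLabels reconstructs the labels the compactor would see for a block,
// given the per-tenant value of out-of-order-blocks-external-label-enabled.
func deduceLabels(b indexBlock, oooLabelEnabled bool) map[string]string {
	labels := map[string]string{}
	if b.CompactorShardID != "" {
		labels["__compactor_shard_id__"] = b.CompactorShardID
	}
	if b.OutOfOrder && oooLabelEnabled {
		// Assumed label value; check the constant in pkg/storage/tsdb/config.go.
		labels["__out_of_order__"] = "true"
	}
	return labels
}

func main() {
	firstLevel := indexBlock{OutOfOrder: true}
	// After compaction the flag is gone, so the label can no longer be deduced.
	compacted := indexBlock{CompactorShardID: "1_of_2", OutOfOrder: false}

	fmt.Println(deduceLabels(firstLevel, true))
	fmt.Println(deduceLabels(compacted, true))
}
```

For the compacted block the sketch returns only the shard ID label even if the real block still carries __out_of_order__, which is exactly the gap described in solution 2.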

I suggest we implement solution 1.

pstibrany avatar Feb 07 '24 13:02 pstibrany

> we don't preserve all block labels in the bucket index

What labels are absent from the bucket index? Is it labels from partial blocks?

seizethedave avatar Mar 22 '24 16:03 seizethedave

> What labels are absent from the bucket index? Is it labels from partial blocks?

Well, the bucket index doesn't preserve any block labels, although we do set the CompactorShardID field from the __compactor_shard_id__ label.

https://github.com/grafana/mimir/blob/fcd9f0300f8cbce54fb9394b1649964c7c634acf/pkg/storage/tsdb/bucketindex/index.go#L91-L92

In theory, blocks can have any number of other labels, and the compactor would only compact blocks with equal labels together. In practice, Mimir only uses __compactor_shard_id__ and __out_of_order__; all other labels are deprecated.
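A rough sketch of why the estimate diverges: grouping by the full label set yields more groups than grouping by the shard ID alone. The block type and the helper functions below are hypothetical simplifications, not Mimir's actual compactor or bucket-index code, and the "true" label value is assumed.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// block is a hypothetical, simplified view of a TSDB block.
type block struct {
	ID     string
	Labels map[string]string // full external labels, as the compactor sees them
}

// labelsKey builds a deterministic grouping key from a label set.
func labelsKey(labels map[string]string) string {
	names := make([]string, 0, len(labels))
	for name := range labels {
		names = append(names, name)
	}
	sort.Strings(names)
	var sb strings.Builder
	for _, name := range names {
		sb.WriteString(name + "=" + labels[name] + ";")
	}
	return sb.String()
}

// countGroups returns the number of distinct groups produced by keyFn.
func countGroups(blocks []block, keyFn func(block) string) int {
	groups := map[string]struct{}{}
	for _, b := range blocks {
		groups[keyFn(b)] = struct{}{}
	}
	return len(groups)
}

// byAllLabels groups only blocks with equal labels together.
func byAllLabels(b block) string { return labelsKey(b.Labels) }

// byShardOnly groups by the shard ID alone, the only label-derived
// value the bucket index keeps.
func byShardOnly(b block) string { return b.Labels["__compactor_shard_id__"] }

// sampleBlocks returns two blocks in the same shard, one of them out-of-order.
func sampleBlocks() []block {
	return []block{
		{ID: "A", Labels: map[string]string{"__compactor_shard_id__": "1_of_2"}},
		{ID: "B", Labels: map[string]string{"__compactor_shard_id__": "1_of_2", "__out_of_order__": "true"}},
	}
}

func main() {
	blocks := sampleBlocks()
	fmt.Println("groups by all labels:", countGroups(blocks, byAllLabels)) // 2
	fmt.Println("groups by shard only:", countGroups(blocks, byShardOnly)) // 1
}
```

With these two sample blocks, the label-aware grouping keeps them apart while the shard-only grouping merges them, so a job count derived from the latter would not match what the compactor plans.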

https://github.com/grafana/mimir/blob/a42733ca55b5265941bf85c817fa2cb36d504542/pkg/storage/tsdb/config.go#L48-L50

pstibrany avatar Mar 22 '24 16:03 pstibrany