promql-engine MergeSelectsOptimizer should consider start, end or range of the selector

Issue

Consider the following query:

sum(rate(metric{node="node1"}[5m]))/count(max_over_time(metric{node="node"}[6m]))

The MergeSelectsOptimizer will try to optimize the LHS to matchers="{__name__='metric'}" and filters="{node='node1'}"

However, since the range of the selectors are different they still end up being hashed to different values in the SelectorPool. The engine will end up making two selects to storage for {__name__='metric'}.

But in this case, it would have been more optimal if the LHS wasn't optimized.

Should the MergeSelectsOptimizer also consider the start, end and range of the selector when finding a replacement?

Jan 31 '25 05:01 harry671003

Great finding! I wonder - couldn't we handle that in the operator by fetching once, filtering and then just stopping the left one a bit earlier - but I guess with offset it could be tricky to decide if it's cheaper to fetch one big vector for big range or multiple smaller ones - maybe best is to just not optimize here

Jan 31 '25 08:01 MichaHoffmann

Not only just start, end and range can cause the problem. There are other fields that are also used in caching https://github.com/thanos-io/promql-engine/blob/main/storage/prometheus/pool.go#L55.

For example, different hints.Func and hints.Grouping would cause different cache key even if the select is the same. This makes it hard to support projection in merge select optimizer if func is different.

And there is no easy way to detect these issue in MergeSelectsOptimizer as select hints are generated at execution time but logical plan needs to be created before that so there seems no way to access hints.

Apr 23 '25 06:04 yeya24