Add projection pushdown optimizer
Fixes #496
This PR adds an implementation of Projection Pushdown optimizer which checks the expression and injects projection to vector selector accordingly.
Note:
- It supports projection pushdown for aggregation.
- It supports projection pushdown for one to one binary expression. Also support one-to-many and many-to-one binary expressions mainly on the low cardinality side. Thanks for the inspiration from https://github.com/thanos-io/promql-engine/pull/550
- It supports functions like label_replace and label_join. And other functions such as
scalar,absentandabsent_over_timeby inspiration from https://github.com/thanos-io/promql-engine/pull/550 - Changes made to skip merging select if the vector selector has projection. For details of why we don't do a full support for merge projection, see https://github.com/thanos-io/promql-engine/pull/549#issuecomment-2823226765
- Storage layer needs to return a label of series hash for proper deduplication.
It is able to pass promqlsmith based fuzz test locally by using a mock implementation of projection querier.
This is unrelated to this change but seems a valid data race.
WARNING: DATA RACE
Write at 0x00c000bb4630 by goroutine 23442:
runtime.mapassign()
/opt/hostedtoolcache/go/1.24.0/x64/src/internal/runtime/maps/runtime_swiss.go:191 +0x0
github.com/prometheus/prometheus/util/annotations.(*Annotations).Add()
/home/runner/go/pkg/mod/github.com/prometheus/[email protected]/util/annotations/annotations.go:45 +0x169
github.com/thanos-io/promql-engine/execution/warnings.(*warnings).add()
/home/runner/work/promql-engine/promql-engine/execution/warnings/context.go:29 +0x9d
github.com/thanos-io/promql-engine/execution/warnings.AddToContext()
/home/runner/work/promql-engine/promql-engine/execution/warnings/context.go:53 +0x6a
github.com/thanos-io/promql-engine/storage/prometheus.(*matrixSelector).Next()
/home/runner/work/promql-engine/promql-engine/storage/prometheus/matrix_selector.go:164 +0x3aa
github.com/thanos-io/promql-engine/execution/exchange.(*concurrencyOperator).pull()
/home/runner/work/promql-engine/promql-engine/execution/exchange/concurrent.go:97 +0x181
github.com/thanos-io/promql-engine/execution/exchange.(*concurrencyOperator).Next.func2.gowrap1()
/home/runner/work/promql-engine/promql-engine/execution/exchange/concurrent.go:73 +0x4f
Previous read at 0x00c000bb4630 by goroutine 23249:
runtime.mapIterStart()
/opt/hostedtoolcache/go/1.24.0/x64/src/runtime/map_swiss.go:165 +0x0
github.com/prometheus/prometheus/util/annotations.(*Annotations).Merge()
/home/runner/go/pkg/mod/github.com/prometheus/[email protected]/util/annotations/annotations.go:58 +0x148
github.com/thanos-io/promql-engine/engine.(*compatibilityQuery).Exec.func1()
/home/runner/work/promql-engine/promql-engine/engine/engine.go:539 +0x3f
runtime.deferreturn()
/opt/hostedtoolcache/go/1.24.0/x64/src/runtime/panic.go:605 +0x5d
github.com/prometheus/prometheus/promql/promqltest.(*test).runInstantQuery()
/home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1185 +0x3e5
github.com/prometheus/prometheus/promql/promqltest.(*test).runInstantQuery()
/home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1180 +0x15d
github.com/prometheus/prometheus/promql/promqltest.(*test).execInstantEval()
/home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1172 +0x366
runtime.deferreturn()
/opt/hostedtoolcache/go/1.24.0/x64/src/runtime/panic.go:605 +0x5d
github.com/prometheus/prometheus/promql/promqltest.(*test).runInstantQuery()
/home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1217 +0x124b
github.com/prometheus/prometheus/promql/promqltest.(*test).runInstantQuery()
/home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1180 +0x15d
github.com/prometheus/prometheus/promql/promqltest.(*test).execInstantEval()
/home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1172 +0x366
runtime.deferreturn()
/opt/hostedtoolcache/go/1.24.0/x64/src/runtime/panic.go:605 +0x5d
github.com/prometheus/prometheus/promql/promqltest.(*test).runInstantQuery()
/home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1217 +0x124b
github.com/prometheus/prometheus/promql/promqltest.(*test).runInstantQuery()
/home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1180 +0x15d
github.com/prometheus/prometheus/promql/promqltest.(*test).execInstantEval()
/home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1172 +0x366
runtime.deferreturn()
/opt/hostedtoolcache/go/1.24.0/x64/src/runtime/panic.go:605 +0x5d
github.com/prometheus/prometheus/promql/promqltest.(*test).runInstantQuery()
/home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1217 +0x124b
github.com/prometheus/prometheus/promql/promqltest.(*test).runInstantQuery()
/home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1180 +0x15d
github.com/prometheus/prometheus/promql/promqltest.(*test).execInstantEval()
/home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1172 +0x366
runtime.deferreturn()
/opt/hostedtoolcache/go/1.24.0/x64/src/runtime/panic.go:605 +0x5d
github.com/prometheus/prometheus/promql/promqltest.(*test).runInstantQuery()
/home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1217 +0x124b
github.com/prometheus/prometheus/promql/promqltest.(*test).runInstantQuery()
/home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1180 +0x15d
github.com/prometheus/prometheus/promql/promqltest.(*test).execInstantEval()
/home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1172 +0x366
runtime.deferreturn()
/opt/hostedtoolcache/go/1.24.0/x64/src/runtime/panic.go:605 +0x5d
github.com/prometheus/prometheus/promql/promqltest.(*test).runInstantQuery()
/home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1217 +0x124b
github.com/prometheus/prometheus/promql/promqltest.(*test).runInstantQuery()
/home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1180 +0x15d
github.com/prometheus/prometheus/promql/promqltest.(*test).execInstantEval()
/home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1172 +0x366
github.com/prometheus/prometheus/promql/promqltest.(*test).execEval.func1()
/home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1113 +0xa8
github.com/prometheus/prometheus/promql/promqltest.(*test).execEval.func2()
/home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1122 +0x2f
testing.tRunner()
/opt/hostedtoolcache/go/1.24.0/x64/src/testing/testing.go:1792 +0x225
testing.(*T).Run.gowrap1()
/opt/hostedtoolcache/go/1.24.0/x64/src/testing/testing.go:1851 +0x44
@yeya24 would you mind comparing the implementation to https://github.com/thanos-io/promql-engine/pull/550?
This is the optimizer we already use in production and it has worked well for us so far.
Nice thanks @fpetkovski. Your implementation is much cleaner. I will definitely take a closer look
I tried to implement projection support for merge select and I am able to create a working version in https://github.com/yeya24/promql-engine/commit/6c7b7a206577477e930277e6a61096555ad3018c. It is able to pass correctness tests when I tried it locally.
However, I kind of doubt if it is worth pursuing due to several limitations:
- https://github.com/thanos-io/promql-engine/issues/512#issuecomment-2823200537 Merge select is not aware of the actual selector cache key. The merged projection could ended up fetching more data and no actual select getting merged due to different keys.
- If the selector with least selectivity doesn't have any projection for example
{__name__="http_requests_total"}, we cannot merge projection for queries likecount without(a, b, c) (__name__="http_requests_total", status_code="200") / {__name__="http_requests_total"}. This is because the left expression requires series hash but{__name__="http_requests_total"}selector cannot ask for series hash as there is no projection labels. - Increased complexity
Things can be probably simplified a bit if we assume duplicate label check is disabled. However, this might impact something else of the optimizer and might not work for all cases.
I would propose that we skip merge select optimizer for now for projection even though it is working using the code above. Merge select optimizer should check if we have projection in vector selector. If projection is enabled, don't try to merge select. Or vice versa, don't do projection pushdown if merge select is applied
Changes made to latest commit https://github.com/thanos-io/promql-engine/pull/549/commits/a35726dada1c6e148e5e8321824738a12cdb7078 to don't merge select for vector selector with projection