promql-engine icon indicating copy to clipboard operation
promql-engine copied to clipboard

Add projection pushdown optimizer

Open yeya24 opened this issue 9 months ago • 5 comments

Fixes #496

This PR adds an implementation of Projection Pushdown optimizer which checks the expression and injects projection to vector selector accordingly.

Note:

  • It supports projection pushdown for aggregation.
  • It supports projection pushdown for one to one binary expression. Also support one-to-many and many-to-one binary expressions mainly on the low cardinality side. Thanks for the inspiration from https://github.com/thanos-io/promql-engine/pull/550
  • It supports functions like label_replace and label_join. And other functions such as scalar, absent and absent_over_time by inspiration from https://github.com/thanos-io/promql-engine/pull/550
  • Changes made to skip merging select if the vector selector has projection. For details of why we don't do a full support for merge projection, see https://github.com/thanos-io/promql-engine/pull/549#issuecomment-2823226765
  • Storage layer needs to return a label of series hash for proper deduplication.

It is able to pass promqlsmith based fuzz test locally by using a mock implementation of projection querier.

yeya24 avatar Apr 08 '25 05:04 yeya24

This is unrelated to this change but seems a valid data race.

WARNING: DATA RACE
Write at 0x00c000bb4630 by goroutine 23442:
  runtime.mapassign()
      /opt/hostedtoolcache/go/1.24.0/x64/src/internal/runtime/maps/runtime_swiss.go:191 +0x0
  github.com/prometheus/prometheus/util/annotations.(*Annotations).Add()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/util/annotations/annotations.go:45 +0x169
  github.com/thanos-io/promql-engine/execution/warnings.(*warnings).add()
      /home/runner/work/promql-engine/promql-engine/execution/warnings/context.go:29 +0x9d
  github.com/thanos-io/promql-engine/execution/warnings.AddToContext()
      /home/runner/work/promql-engine/promql-engine/execution/warnings/context.go:53 +0x6a
  github.com/thanos-io/promql-engine/storage/prometheus.(*matrixSelector).Next()
      /home/runner/work/promql-engine/promql-engine/storage/prometheus/matrix_selector.go:164 +0x3aa
  github.com/thanos-io/promql-engine/execution/exchange.(*concurrencyOperator).pull()
      /home/runner/work/promql-engine/promql-engine/execution/exchange/concurrent.go:97 +0x181
  github.com/thanos-io/promql-engine/execution/exchange.(*concurrencyOperator).Next.func2.gowrap1()
      /home/runner/work/promql-engine/promql-engine/execution/exchange/concurrent.go:73 +0x4f

Previous read at 0x00c000bb4630 by goroutine 23249:
  runtime.mapIterStart()
      /opt/hostedtoolcache/go/1.24.0/x64/src/runtime/map_swiss.go:165 +0x0
  github.com/prometheus/prometheus/util/annotations.(*Annotations).Merge()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/util/annotations/annotations.go:58 +0x148
  github.com/thanos-io/promql-engine/engine.(*compatibilityQuery).Exec.func1()
      /home/runner/work/promql-engine/promql-engine/engine/engine.go:539 +0x3f
  runtime.deferreturn()
      /opt/hostedtoolcache/go/1.24.0/x64/src/runtime/panic.go:605 +0x5d
  github.com/prometheus/prometheus/promql/promqltest.(*test).runInstantQuery()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1185 +0x3e5
  github.com/prometheus/prometheus/promql/promqltest.(*test).runInstantQuery()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1180 +0x15d
  github.com/prometheus/prometheus/promql/promqltest.(*test).execInstantEval()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1172 +0x366
  runtime.deferreturn()
      /opt/hostedtoolcache/go/1.24.0/x64/src/runtime/panic.go:605 +0x5d
  github.com/prometheus/prometheus/promql/promqltest.(*test).runInstantQuery()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1217 +0x124b
  github.com/prometheus/prometheus/promql/promqltest.(*test).runInstantQuery()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1180 +0x15d
  github.com/prometheus/prometheus/promql/promqltest.(*test).execInstantEval()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1172 +0x366
  runtime.deferreturn()
      /opt/hostedtoolcache/go/1.24.0/x64/src/runtime/panic.go:605 +0x5d
  github.com/prometheus/prometheus/promql/promqltest.(*test).runInstantQuery()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1217 +0x124b
  github.com/prometheus/prometheus/promql/promqltest.(*test).runInstantQuery()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1180 +0x15d
  github.com/prometheus/prometheus/promql/promqltest.(*test).execInstantEval()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1172 +0x366
  runtime.deferreturn()
      /opt/hostedtoolcache/go/1.24.0/x64/src/runtime/panic.go:605 +0x5d
  github.com/prometheus/prometheus/promql/promqltest.(*test).runInstantQuery()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1217 +0x124b
  github.com/prometheus/prometheus/promql/promqltest.(*test).runInstantQuery()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1180 +0x15d
  github.com/prometheus/prometheus/promql/promqltest.(*test).execInstantEval()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1172 +0x366
  runtime.deferreturn()
      /opt/hostedtoolcache/go/1.24.0/x64/src/runtime/panic.go:605 +0x5d
  github.com/prometheus/prometheus/promql/promqltest.(*test).runInstantQuery()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1217 +0x124b
  github.com/prometheus/prometheus/promql/promqltest.(*test).runInstantQuery()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1180 +0x15d
  github.com/prometheus/prometheus/promql/promqltest.(*test).execInstantEval()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1172 +0x366
  runtime.deferreturn()
      /opt/hostedtoolcache/go/1.24.0/x64/src/runtime/panic.go:605 +0x5d
  github.com/prometheus/prometheus/promql/promqltest.(*test).runInstantQuery()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1217 +0x124b
  github.com/prometheus/prometheus/promql/promqltest.(*test).runInstantQuery()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1180 +0x15d
  github.com/prometheus/prometheus/promql/promqltest.(*test).execInstantEval()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1172 +0x366
  github.com/prometheus/prometheus/promql/promqltest.(*test).execEval.func1()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1113 +0xa8
  github.com/prometheus/prometheus/promql/promqltest.(*test).execEval.func2()
      /home/runner/go/pkg/mod/github.com/prometheus/[email protected]/promql/promqltest/test.go:1122 +0x2f
  testing.tRunner()
      /opt/hostedtoolcache/go/1.24.0/x64/src/testing/testing.go:1792 +0x225
  testing.(*T).Run.gowrap1()
      /opt/hostedtoolcache/go/1.24.0/x64/src/testing/testing.go:1851 +0x44

yeya24 avatar Apr 08 '25 23:04 yeya24

@yeya24 would you mind comparing the implementation to https://github.com/thanos-io/promql-engine/pull/550?

This is the optimizer we already use in production and it has worked well for us so far.

fpetkovski avatar Apr 11 '25 06:04 fpetkovski

Nice thanks @fpetkovski. Your implementation is much cleaner. I will definitely take a closer look

yeya24 avatar Apr 11 '25 07:04 yeya24

I tried to implement projection support for merge select and I am able to create a working version in https://github.com/yeya24/promql-engine/commit/6c7b7a206577477e930277e6a61096555ad3018c. It is able to pass correctness tests when I tried it locally.

However, I kind of doubt if it is worth pursuing due to several limitations:

  • https://github.com/thanos-io/promql-engine/issues/512#issuecomment-2823200537 Merge select is not aware of the actual selector cache key. The merged projection could ended up fetching more data and no actual select getting merged due to different keys.
  • If the selector with least selectivity doesn't have any projection for example {__name__="http_requests_total"}, we cannot merge projection for queries like count without(a, b, c) (__name__="http_requests_total", status_code="200") / {__name__="http_requests_total"}. This is because the left expression requires series hash but {__name__="http_requests_total"} selector cannot ask for series hash as there is no projection labels.
  • Increased complexity

Things can be probably simplified a bit if we assume duplicate label check is disabled. However, this might impact something else of the optimizer and might not work for all cases.

I would propose that we skip merge select optimizer for now for projection even though it is working using the code above. Merge select optimizer should check if we have projection in vector selector. If projection is enabled, don't try to merge select. Or vice versa, don't do projection pushdown if merge select is applied

yeya24 avatar Apr 23 '25 06:04 yeya24

Changes made to latest commit https://github.com/thanos-io/promql-engine/pull/549/commits/a35726dada1c6e148e5e8321824738a12cdb7078 to don't merge select for vector selector with projection

yeya24 avatar Apr 23 '25 15:04 yeya24