pinot icon indicating copy to clipboard operation
pinot copied to clipboard

Hack to fix some problematic queries

Open gortiz opened this issue 1 year ago • 1 comments

Queries like:

WITH CTE_B AS (
  SELECT 1 AS __ts FROM a GROUP BY __ts
) 
SELECT 1 
FROM CTE_B 
WHERE __ts >= 1 AND __ts < 2

Fail when they are executed. The reason is that very tricky, but mostly is due to:

  • Our rules are not able to get rid of the tautology (1 >= 1 and 1 < 2 is always true), so it is pushed into the leaf stage
  • The leaf stage generates an equivalent V1 query, but does not apply CompileTimeFunctionsInvoker, so these tautologies are not removed from the expression tree
  • V1 row-by-row engine doesn't know how to execute these tautologies because it only expects expressions like col1 > doubleLiteral.

These tautologies are generated in PinotFilterExpandSearchRule. We have two rules that should ideally remove them:

  • PinotEvaluateLiteralRule (which evaluates the expressions using V1 semantics)
  • CoreRules.FILTER_REDUCE_EXPRESSIONS (which evalautes the expressions using Calcite semantics)

Alternatively, we should be able to call CompileTimeFunctionsInvoker rewriter in the leaf stage. This invoker is also using V1 semantics.

But any of these changes works.

  1. PinotEvaluateLiteralRule and CompileTimeFunctionsInvoker partially work because V1 can execute = assuming each operand is a number. Things like 1=1 or '123'='123' but cannot execute 'a' = 'a'.
  2. CoreRules.FILTER_REDUCE_EXPRESSIONS can get rid of the tautologies, but in some scenarios it creates just another SEARCH operation. V1 cannot execute them.

I'm not proud of the solution found in this PR. It basically consist of applying the pair PinotFilterExpandSearchRule, CoreRules.FILTER_REDUCE_EXPRESSIONS twice. Although it seems to be working in all cases I tried, I'm sure there will be cases were this solution is not good enough.

IMO it would be better to:

  1. Have a centralized function register, as suggested by @walterddr long ago.
  2. Support search in V1.

Do you have other ideas on how to fix this problem?

gortiz avatar May 17 '24 14:05 gortiz

Codecov Report

Attention: Patch coverage is 73.33333% with 4 lines in your changes are missing coverage. Please review.

Project coverage is 35.09%. Comparing base (59551e4) to head (a163afd). Report is 458 commits behind head on master.

Files Patch % Lines
.../java/org/apache/pinot/query/QueryEnvironment.java 73.33% 2 Missing and 2 partials :warning:
Additional details and impacted files
@@              Coverage Diff              @@
##             master   #13174       +/-   ##
=============================================
- Coverage     61.75%   35.09%   -26.67%     
+ Complexity      207        6      -201     
=============================================
  Files          2436     2445        +9     
  Lines        133233   134648     +1415     
  Branches      20636    20843      +207     
=============================================
- Hits          82274    47249    -35025     
- Misses        44911    83919    +39008     
+ Partials       6048     3480     -2568     
Flag Coverage Δ
custom-integration1 <0.01% <0.00%> (-0.01%) :arrow_down:
integration <0.01% <0.00%> (-0.01%) :arrow_down:
integration1 <0.01% <0.00%> (-0.01%) :arrow_down:
integration2 0.00% <0.00%> (ø)
java-11 <0.01% <0.00%> (-61.71%) :arrow_down:
java-21 35.09% <73.33%> (-26.54%) :arrow_down:
skip-bytebuffers-false 35.07% <73.33%> (-26.68%) :arrow_down:
skip-bytebuffers-true 35.07% <73.33%> (+7.34%) :arrow_up:
temurin 35.09% <73.33%> (-26.67%) :arrow_down:
unittests 46.55% <73.33%> (-15.20%) :arrow_down:
unittests1 46.55% <73.33%> (-0.34%) :arrow_down:
unittests2 ?

Flags with carried forward coverage won't be shown. Click here to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov-commenter avatar May 17 '24 15:05 codecov-commenter