calcite icon indicating copy to clipboard operation
calcite copied to clipboard

[CALCITE-6592] RelMdPredicates pull up wrong predicate from UNION whe…

Open NobiGo opened this issue 1 year ago • 10 comments

…n it's input include NULL VALUE

NobiGo avatar Sep 24 '24 12:09 NobiGo

Forch-pushed to rerun CI.

NobiGo avatar Sep 25 '24 01:09 NobiGo

If the problem is indeed in Values please add specific tests for this expression and update the JIRA/PR accordingly.

zabetak avatar Sep 25 '24 08:09 zabetak

If the problem is indeed in Values please add specific tests for this expression and update the JIRA/PR accordingly.

@zabetak Hi, It is not a VALUES problem. But When I try to add a test for VALUERS. I find another problem in VALUES. For example:

VALUES(1,2,3)

Predicates should be ($0=1,$1=2,$2=3), but now is $0=1. In my PR I have make it right and add the unit test.

NobiGo avatar Sep 26 '24 01:09 NobiGo

I see that you are performing the changes that we discuss here in another pull request (#3981). This is a bit confusing and I am losing track on where I should comment on.

zabetak avatar Sep 26 '24 09:09 zabetak

I see that you are performing the changes that we discuss here in another pull request (#3981). This is a bit confusing and I am losing track on where I should comment on.

These are two different questions. So I open another PR for VALUES. But we can talk about it together here. These are some of my opinions:

For Values: values(1, 2, 3, null), (1, 2, null, null), (5, 2, 3, null) We pull up the predicate is [=($1, 2), IS NULL($3)]. Do we need to extract [OR(IS NULL($2), =($2, 3))]? I think we don't need it. Because when VALUES size is large, It will be complicated.

NobiGo avatar Sep 26 '24 09:09 NobiGo

Hey @NobiGo are you still planning to move this forward?

zabetak avatar Oct 18 '24 08:10 zabetak

Hey @NobiGo are you still planning to move this forward?

Yes. If I have free time, I will handle it. After fixing the CALCITE-6599(RelMdPredicates should pull up more predicates from VALUES when there are several literals) this PR becomes simple. Because we extract the IS NULL not the [=($0, null)] from the VALUES.

NobiGo avatar Oct 18 '24 13:10 NobiGo

I think that pulling up predicates breaks ASOF joins. The spec requires these joins to have predicates of a very specific shape, where each predicate compares a column on one side with a column on the other side. Pulling up predicates may break this invariant. If there is a quick fix that would prevent ASOF joins from participating in the optimization, maybe it can be included in this PR. Otherwise we need to file a new JIRA issue. I think that attempting to optimize some of the existing ASOF examples in asof.iq will exhibit the error. This problem was not introduced in this PR.

mihaibudiu avatar Oct 18 '24 21:10 mihaibudiu

@zabetak As the unit test, We always extract the predicate IS NULL($1) when null or comm = null so the RexUnknownAs will not affect the results. And I have discovered another logic that can be optimized: select comm = 2 from emp where comm = 2 can pull up =($0, true) select comm = 3 from emp where comm = 2 can pull up =($0, false) select comm = 2 from emp where comm = 2 and empno = 1 can pull up =($0, true) select comm = 2 empno = 1 and from emp where comm = 2 and empno = 1 can pull up =($0, true) Now the pull up predicate is nothing. WDYT?

NobiGo avatar Oct 18 '24 21:10 NobiGo

@NobiGo If I understand well there is no problem when pulling predicates from UNION which is inline with my initial sentiment.

The PR in its current shape just adds extra tests for something that is working so the JIRA/PR title is misleading. We should either update the current JIRA/PR to describe the current situation/contribution or contribute the tests under a completely new JIRA/PR if you feel that they are increasing code coverage.

zabetak avatar Oct 23 '24 13:10 zabetak