calcite
calcite copied to clipboard
[CALCITE-4188] Support EnumerableBatchNestedLoopJoin inJdbcToEnumerab…
https://issues.apache.org/jira/browse/CALCITE-4188
Dear @amaliujia, do you have any more comments for this PR?
@angelzouxin will take a look soon. Let's see whether we can commit this PR to 1.26.0
@amaliujia Do you think this should be in the 1.26.0 ? And do you have time to review it ?
@danny0405 my impression is 1.26.0 will be cut by end of this week. I will take a look at this PR soon but it seems that I need more context to understand about this change so I think it might not be feasible to merge this PR before 1.26.0.
I have removed the 1.26.0 from this JIRA (I added that tag before).
I think the PR is in pretty good shape and we should try to get it in for the next release. Are you planning to push it forward @amaliujia or should I take over?
@zabetak please feel free to review this PR and merge it when you think it is ready.
This is a super valuable PR, just wanted to bring it up again and also give some feedback on it:
I rebased this locally off of main today and built it, it works except in certain odd circumstances. For instance, here's the following query being run:
String query = "SELECT dname, MULTISET(" +
" SELECT ename, MULTISET(" +
" SELECT 1 FROM (VALUES (10), (20), (30), (40)) AS T(d_no) WHERE t.d_no IN (10, 20)" +
" ) FROM emp AS e WHERE deptno = dept.deptno" +
") FROM dept";

It also worked with:
String query2 = "select\n" +
" \"first_name\",\n" +
" multiset(\n" +
" select \"store_name\", multiset(\n" +
" select \"fname\"\n" +
" from \"customer\"\n" +
" where \"customer\".\"customer_region_id\" <> 0 \n" +
" ) \n" +
" from \"store\"\n" +
" where \"store_id\" = \"employee\".\"store_id\"\n" +
" )\n" +
"from \"employee\"\n" +
"limit 2";
What I have noticed though is that if the multiset returns an empty relation in a correlated subquery, you get an error about $cor0 variable being null.
To trigger this, you can change the where customer.customer_region_id clause to be:
where "customer"."customer_region_id" = "store"."region_id"
Running that will give you:
Correlation variable $cor0 should be defined
java.lang.AssertionError: Correlation variable $cor0 should be defined
at org.apache.calcite.adapter.enumerable.EnumerableRelImplementor.getCorrelVariableGetter(EnumerableRelImplementor.java:472)
at org.apache.calcite.adapter.jdbc.JdbcToEnumerableConverter.generateCorrelate(JdbcToEnumerableConverter.java:351)
at org.apache.calcite.adapter.jdbc.JdbcToEnumerableConverter.implement(JdbcToEnumerableConverter.java:188)
at org.apache.calcite.adapter.enumerable.EnumerableRelImplementor.visitChild(EnumerableRelImplementor.java:104)
at org.apache.calcite.adapter.enumerable.EnumerableCorrelate.implement(EnumerableCorrelate.java:113)
at org.apache.calcite.adapter.enumerable.EnumerableRelImplementor.visitChild(EnumerableRelImplementor.java:104)
at org.apache.calcite.adapter.enumerable.EnumerableCalc.implement(EnumerableCalc.java:118)
at org.apache.calcite.adapter.enumerable.EnumerableRelImplementor.visitChild(EnumerableRelImplementor.java:104)
at org.apache.calcite.adapter.enumerable.EnumerableCollect.implement(EnumerableCollect.java:81)
at org.apache.calcite.adapter.enumerable.EnumerableRelImplementor.visitChild(EnumerableRelImplementor.java:104)
at org.apache.calcite.adapter.enumerable.EnumerableNestedLoopJoin.implement(EnumerableNestedLoopJoin.java:160)
at org.apache.calcite.adapter.enumerable.EnumerableRelImplementor.visitChild(EnumerableRelImplementor.java:104)
at org.apache.calcite.adapter.enumerable.EnumerableLimit.implement(EnumerableLimit.java:98)
at org.apache.calcite.adapter.enumerable.EnumerableRelImplementor.visitChild(EnumerableRelImplementor.java:104)
at org.apache.calcite.adapter.enumerable.EnumerableCalc.implement(EnumerableCalc.java:118)
at org.apache.calcite.adapter.enumerable.EnumerableRelImplementor.implementRoot(EnumerableRelImplementor.java:111)
at org.apache.calcite.adapter.enumerable.EnumerableInterpretable.toBindable(EnumerableInterpretable.java:114)
at org.apache.calcite.prepare.CalcitePrepareImpl$CalcitePreparingStmt.implement(CalcitePrepareImpl.java:1131)
at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:324)
And there is one more error which is less understandable to me, you can produce this one by modifying the emps deps query to:
String query = "SELECT dname, MULTISET(" +
" SELECT ename, MULTISET(" +
" SELECT * FROM dept AS d WHERE d.deptno = e.deptno" +
" ) FROM emp AS e WHERE deptno = dept.deptno" +
") FROM dept";
java.sql.SQLException: Error while executing SQL: Index 7 out of bounds for length 3
...
Caused by: java.lang.ArrayIndexOutOfBoundsException: Index 7 out of bounds for length 3
at com.google.common.collect.RegularImmutableList.get(RegularImmutableList.java:77)
at org.apache.calcite.rel.core.TableScan.project(TableScan.java:164)
at org.apache.calcite.sql2rel.RelFieldTrimmer.trimFields(RelFieldTrimmer.java:1282)
... 66 more
I would offer to fix these and push this PR through to finished, but I still don't really have a clue about how Calcite works under the hood and that fancy relational algebra stuff =/
Hope this info is useful at least!