calcite
calcite copied to clipboard
[CALCITE-3513] Unify TableFunction implementor's NullPolicy and its behavior
Details are illustrated in CALCITE-3513, Table function's implementation should be NullPolicy.NONE
.
hi @DonnyZone This modification is more rigorous than before, and the scope is even narrower. Is there any reason for this change?
@XuQianJin-Stars I'm refactoring the codegen component now and try to unify the inconsistencies buried under current codegen framework. This PR need to do more work. I will update it these couple of days.
Spelling: behavior not beheavior
ScalarFunctionImpl
determines NullPolicy based on annotations, while TableFunctionImpl
sets NullPolicy=ANY
by default for all cases.
However, according to the javadoc in TableFunction
, null
arguments are allowed.
/**
* Returns the row type of the table yielded by this function when
* applied to given arguments. Only literal arguments are passed,
* non-literal are replaced with default values (null, 0, false, etc).
*
* @param arguments arguments of a function call (only literal arguments
* are passed, nulls for non-literal ones)
* @return element type of the table (e.g. {@code Object[].class})
*/
Type getElementType(List<Object> arguments);
Hi, @julianhyde,
Current now, I adopt the approach as that in ScalarFunctionImpl
to dynamically determines the NullPolicy
based on annotation.
But as shown in TableFunctionTests
, it seems not to be reasonable when a table function returns null
. The code generated (org.apache.calcite.runtime.Enumerables.slice0(null)
) will produce NPE.
@vlsi Thanks for review. Actually, I do not have a clear direction for this work now. Let me clarify it and look forward to hearing your advice.
At the beginning, I found that NullPolicy.ANY
is not appropriate for all TableFunctions.
For example, the query in TableFunctionTest#testMultipleScannableTableFunctionWithNamedParameters`.
select * from table("s"."Maze"(HEIGHT => 3, WIDTH => 5)`
It passes null
argument (i.e., org.apache.calcite.util.Smalls.MazeTable.generate2(5, 3, null)
), but the result is not null
.
Then, I think NullPolicy.NONE
may be more suitable. But it prevents users to define their own NullPolicy in their implementations.
In the end, I inclined to adopt the approach (getNullPolicy(Method m)
) as that in ScalarFunctionImp
.
But I found that we do not have any standards to handle the case when a table function returns null
, as the NPE in tests. Therefore, NullPolicy.ANY/STRICT/ARG0
seems to be meaningless.
Ok, so your main question is "what should Calcite do when a table function returns NPE"?
Oracle makes it indistinguishable from an empty collection.
For instance:
select * from dual, table(cast(null as sys.odcinumberlist)) t;
DUMMY | COLUMN_VALUE
0 rows
select * from dual, table(sys.odcinumberlist()) t;
DUMMY | COLUMN_VALUE
0 rows
select * from dual, table(cast(null as sys.odcinumberlist))(+) t;
DUMMY | COLUMN_VALUE
Y | [NULL]
1 row
select * from dual, table(sys.odcinumberlist())(+) t;
DUMMY | COLUMN_VALUE
Y | [NULL]
1 row
Therefore, NullPolicy.ANY/STRICT/ARG0 seems to be meaningless
It is probably more to optimization and/or prevent invocation of the function on invalid inputs.
For instance, if NullPolicy is set to ANY
, then enumerable executor won't invoke the function in case there are null arguments.
On the other hand, treating null
table function result as an empty table is a different issue.
Is there any calcite connection to run the table function Because we are running this function in windows 7 operating system through sqlline. As sqlline.bat sqlline>!connect jdbc:calcite: admin admin
This is good discussion, but can if you are changing the specification, can you move any conclusions over to the JIRA case?
I think that table functions should behave the same as regular functions with regard to null arguments.
IIRC, there are annotations that you can put against a UDF to control whether a null argument means a null result. If so, those annotations should also apply to table functions.
I agree with @vlsi that if a table function returns null it should be interpreted in SQL as an empty relation. (Vladimir, did you check what happens if a table function returns null and it is an argument to a consuming table function? I would guess that the consuming table function sees an empty, but not null, argument.)
@34venu I'm not familiar with sqlline. The best option is to send email to the developers list (https://calcite.apache.org/community/#mailing-lists).
Vladimir, did you check what happens if a table function returns null and it is an argument to a consuming table function?
null
argument is just a null argument. Arguments are never converted to empty relations.
Note: Oracle schema does not make the distinction between regular functions
and table functions
. A function can be used as a table function if it returns a collection.
However, the very same function can be used in SELECT
or in PL/SQL
, and null
values are allowed for collections.
So I think it should be fine to return null from Calcite's table function, however, the executor should treat table(null)
as an empty relation of the appropriate type.