Support nulls and empty for array functions
Which issue does this PR close?
Closes #7142. Re-open 7142 if others than append/prepend need to support nulls and empty.
Rationale for this change
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?
@izveigor @alamb Ready for review, thanks!
I'll put this on my list -- thanks @jayzhan211
I have extensively restructured the code in order to enhance its readability and extensibility. Also, array_concat is considered.
I'm sure this PR can be further improved but it is time to seek feedback to prevent me from working in the wrong direction.
cc @alamb @izveigor
Hello, @jayzhan211! Sorry for the long delay. I will carefully review these changes tomorrow.
However, I have some questions about the nature of functions in Arrow Datafusion. I recently thought about solving the issue: https://github.com/apache/arrow-datafusion/issues/6559 and I noticed a fact that User Defined Functions in DataFusion does not parameterize the null handling behaviour (Unlike DuckDB with special link) So If we want to implement new array or other nested data types functions (like struct_insert) maybe it is better to remake signature's structure (I mean adding new boolean argument special to Signature).
I want to hear @alamb's opinion about this opportunity.
Hello, @jayzhan211! Sorry for the long delay. I will carefully review these changes tomorrow.
However, I have some questions about the nature of functions in Arrow Datafusion. I recently thought about solving the issue: https://github.com/apache/arrow-datafusion/issues/6559 and I noticed a fact that User Defined Functions in DataFusion does not parameterize the null handling behaviour (Unlike DuckDB with
speciallink) So If we want to implement new array or other nested data types functions (likestruct_insert) maybe it is better to remake signature's structure (I mean adding new boolean argumentspecialtoSignature).I want to hear @alamb's opinion about this opportunity.
I'm not familiar with udf. Is it possible to customize the behavior of null in udf? Why do we need additional parameters to deal with null?
In this example,
why not pass Option<Fn>: dont_intercept_null to indicate whether customized null handling? None for default, Some for UDF.
duckdb.create_function('dont_intercept', dont_intercept_null, [BIGINT], BIGINT)
res = duckdb.sql("""
select dont_intercept(NULL)
""").fetchall()
print(res)
# [(None,)]
duckdb.remove_function('dont_intercept')
duckdb.create_function('dont_intercept', dont_intercept_null, [BIGINT], BIGINT, null_handling='special')
res = duckdb.sql("""
select dont_intercept(NULL)
""").fetchall()
print(res)
I plan to review this PR later -- I am on vacation this week so my response will likely be delayed
Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or this will be closed in 7 days.