datafusion icon indicating copy to clipboard operation
datafusion copied to clipboard

`select array_concat([])` panicked

Open jonahgao opened this issue 10 months ago • 8 comments

Describe the bug

thread 'main' panicked at datafusion/datafusion/functions-array/src/concat.rs:275:32: index out of bounds: the len is 0 but the index is 0

To Reproduce

Run any of the following queries in CLI.

DataFusion CLI v37.1.0
> select array_concat([]);
DataFusion CLI v37.1.0
>  select array_concat(make_array());

Expected behavior

No response

Additional context

No response

jonahgao avatar Apr 23 '24 15:04 jonahgao

I can fix this one if no one is working on this

Lordworms avatar Apr 23 '24 17:04 Lordworms

@Lordworms Not sure what is your idea to solve this problem. I expect this to be solved with the correct signature. I had worked on https://github.com/apache/datafusion/pull/8594 before but the solution is quite a specialized case. If you come out with another better idea, it would be great!

jayzhan211 avatar Apr 23 '24 23:04 jayzhan211

select array_concat(make_array());

Hi @jayzhan211 I was going to implement the same behavior like DuckDB image I could directly check the returnType here do match the behavior of DuckDB image

and make datafusion support query like image Don't know if it could meet the issue's need

Lordworms avatar Apr 24 '24 04:04 Lordworms

@Lordworms I think arguments validity check are more like signature instead of one that need to be computed by the user in return_type.

jayzhan211 avatar Apr 24 '24 05:04 jayzhan211

@Lordworms I think arguments validity check are more like signature instead of one that need to be computed by the

Yes, I understand, but if we need to validate argument in signature a suitable way to do it is more likely what you did in https://github.com/apache/datafusion/pull/8594

user in return_type.

I'll try to see if I can come up with a better way, thanks for your advice!

Lordworms avatar Apr 24 '24 14:04 Lordworms

After #10439, I believe array_concat should use user-defined signature given it's speciality, therefore we can check the empty array in coerce_types for array_concat and return exec_error instead of panicking

cc @Lordworms

jayzhan211 avatar May 14 '24 08:05 jayzhan211

I have a PR that updates the array_concat signature, maybe I can resolve the panic issue together.

jayzhan211 avatar May 15 '24 00:05 jayzhan211

Actually, I'm thinking about whether we should change the behavior of array_concat similar to postgres and duckdb. It is one of the earliest array functions that we don't follow what postgres and duckdb's behavior 🤔

jayzhan211 avatar May 19 '24 01:05 jayzhan211

Fixed by #10790

jonahgao avatar Jun 06 '24 01:06 jonahgao