Joins on Structs fail at runtime

Open jacksonrnewhouse opened this issue 1 year ago • 0 comments

Describe the bug

If you attempt to join two tables on a struct field, the query plans successfully, although the struct equality ends up in the join filter rather than in the on vector. At runtime, however, it fails with "Invalid comparison operation". In particular, it triggers this error from arrow-rs: https://github.com/apache/arrow-rs/blob/db811083669df66992008c9409b743a2e365adb0/arrow-ord/src/cmp.rs#L202.

To Reproduce

I wrote a failing test that just does a self join, at https://github.com/apache/arrow-datafusion/compare/35.0.0...ArroyoSystems:arrow-datafusion:bug_report/struct_join_fails_at_execution. The failure message is:

thread 'user_defined::user_defined_aggregates::test_struct_join' panicked at datafusion/core/tests/user_defined/user_defined_aggregates.rs:172:60:
called `Result::unwrap()` on an `Err` value: Execution("Fail to build join indices in NestedLoopJoinExec, error:Arrow error: Invalid argument error: Invalid comparison operation: Struct([Field { name: \"value\", data_type: Float64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"time\", data_type: Timestamp(Nanosecond, None), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }]) == Struct([Field { name: \"value\", data_type: Float64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"time\", data_type: Timestamp(Nanosecond, None), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }])")

Expected behavior

Either the join should fail at planning with a clear error that joins on structs are not supported or, preferably, DataFusion should support joins on two structs of the same type.
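To make the preferred behavior concrete, here is a rough, stdlib-only sketch of what a struct-equality join could compute for the struct type in the error above (a nullable Float64 `value` and a nullable nanosecond timestamp `time`). It is not DataFusion or arrow-rs code: `Row`, `field_eq`, `struct_eq` and `nested_loop_join` are hypothetical names, and the null handling (SQL-style three-valued logic applied field by field) is an assumption about one reasonable semantics, not something either project has settled on.

```rust
/// Hypothetical stand-in for one value of the struct column from the error
/// message: { value: Float64 (nullable), time: Timestamp(Nanosecond) (nullable) }.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Row {
    value: Option<f64>,
    time: Option<i64>, // nanoseconds since the epoch
}

/// SQL-style equality on a single nullable field:
/// the result is unknown (None) if either side is NULL.
fn field_eq<T: PartialEq>(a: Option<T>, b: Option<T>) -> Option<bool> {
    match (a, b) {
        (Some(x), Some(y)) => Some(x == y),
        _ => None,
    }
}

/// Assumed struct equality: the three-valued AND of the per-field comparisons.
/// Any definite mismatch gives false; otherwise any unknown gives unknown.
fn struct_eq(a: &Row, b: &Row) -> Option<bool> {
    let v = field_eq(a.value, b.value);
    let t = field_eq(a.time, b.time);
    match (v, t) {
        (Some(false), _) | (_, Some(false)) => Some(false),
        (Some(true), Some(true)) => Some(true),
        _ => None,
    }
}

/// Naive nested-loop inner join that returns the matching (left, right)
/// index pairs. Only pairs whose comparison is definitely true are kept.
fn nested_loop_join(left: &[Row], right: &[Row]) -> Vec<(usize, usize)> {
    let mut out = Vec::new();
    for (i, l) in left.iter().enumerate() {
        for (j, r) in right.iter().enumerate() {
            if struct_eq(l, r) == Some(true) {
                out.push((i, j));
            }
        }
    }
    out
}
```

Under these assumptions, a self join matches each fully non-null row with itself, while a row with a NULL field matches nothing, mirroring how a plain NULL key behaves in an equi-join. A real implementation would also need to handle a NULL struct value as a whole, which this sketch leaves out.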

Additional context

This comes up with Arroyo where we want to join on time windows, e.g. sliding and tumbling windows.

jacksonrnewhouse avatar Feb 16 '24 20:02 jacksonrnewhouse