Joins on Structs fail at runtime
Describe the bug
If you attempt to join two tables on a struct field, the query plans successfully, albeit with the struct equality placed in the filter rather than in the on vector. However, at execution time it fails with "Invalid comparison operation". In particular, it triggers this error from arrow-rs: https://github.com/apache/arrow-rs/blob/db811083669df66992008c9409b743a2e365adb0/arrow-ord/src/cmp.rs#L202.
To Reproduce
I wrote a failing test that just does a self join: https://github.com/apache/arrow-datafusion/compare/35.0.0...ArroyoSystems:arrow-datafusion:bug_report/struct_join_fails_at_execution. The failure message is:
thread 'user_defined::user_defined_aggregates::test_struct_join' panicked at datafusion/core/tests/user_defined/user_defined_aggregates.rs:172:60:
called `Result::unwrap()` on an `Err` value: Execution("Fail to build join indices in NestedLoopJoinExec, error:Arrow error: Invalid argument error: Invalid comparison operation: Struct([Field { name: \"value\", data_type: Float64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"time\", data_type: Timestamp(Nanosecond, None), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }]) == Struct([Field { name: \"value\", data_type: Float64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"time\", data_type: Timestamp(Nanosecond, None), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }])")
Expected behavior
Either the join should fail at planning with a clear error saying that joins on structs are not supported, or, preferably, DataFusion should support joins on two structs of the same type.
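To make the preferred behavior concrete, here is a rough sketch of what I mean by equality on two structs of the same type, written against plain Rust rather than arrow-rs or DataFusion. Everything in it is hypothetical: `WindowRow` is just a stand-in for one row of the `Struct<value: Float64, time: Timestamp(Nanosecond)>` column from the error above, and `field_eq`, `struct_eq`, and `join_indices` are illustrative names, not existing APIs. It also assumes SQL-style null handling (a null struct or a null field never matches), which is one possible choice and not necessarily the one DataFusion should make.

```rust
// Hypothetical stand-in for one row of a
// Struct<value: Float64, time: Timestamp(Nanosecond)> column.
// This is NOT an arrow-rs or DataFusion type.
#[derive(Debug, Clone, Copy, PartialEq)]
struct WindowRow {
    value: Option<f64>,
    time: Option<i64>, // nanoseconds since the epoch
}

// Assumed SQL-style rule: a null field never equals anything, including null.
fn field_eq<T: PartialEq>(a: Option<T>, b: Option<T>) -> bool {
    match (a, b) {
        (Some(x), Some(y)) => x == y,
        _ => false,
    }
}

// Two structs of the same type are equal when both are non-null and
// every pair of corresponding fields is equal.
fn struct_eq(left: Option<WindowRow>, right: Option<WindowRow>) -> bool {
    match (left, right) {
        (Some(l), Some(r)) => field_eq(l.value, r.value) && field_eq(l.time, r.time),
        _ => false,
    }
}

// Naive nested-loop inner join: returns the (left_index, right_index)
// pairs whose struct keys compare equal.
fn join_indices(
    left: &[Option<WindowRow>],
    right: &[Option<WindowRow>],
) -> Vec<(usize, usize)> {
    let mut out = Vec::new();
    for (i, l) in left.iter().enumerate() {
        for (j, r) in right.iter().enumerate() {
            if struct_eq(*l, *r) {
                out.push((i, j));
            }
        }
    }
    out
}
```

With this rule, a self join only pairs each fully non-null row with itself (and with any other row holding identical field values); rows with a null struct or a null field drop out of the result.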
Additional context
This comes up in Arroyo, where we want to join on time windows, e.g. sliding and tumbling windows.