[FEA] Implement `outputPartitioning` for GPU join execs

Open andygrove opened this issue 1 year ago • 0 comments

Is your feature request related to a problem? Please describe.

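Spark's `BroadcastHashJoinExec`, `ShuffledHashJoinExec`, and `BroadcastNestedLoopJoinExec` classes override `outputPartitioning`, but our GPU implementations do not, so they inherit the `SparkPlan` default of `UnknownPartitioning(0)`. Spark may then fail to recognize that a join's output already satisfies a downstream distribution requirement, which could lead to missed optimizations such as skipping a redundant shuffle.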

Describe the solution you'd like

1. Add failing tests that compare GPU and CPU join plans to ensure they report the same output partitioning.
2. Implement `outputPartitioning` for the GPU join execs.
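
As a rough illustration of what the implementation might need to compute, here is a minimal, self-contained sketch loosely modeled on the join-type rule that Spark's shuffled joins appear to follow. Every type and name below (`Partitioning`, `JoinType`, `JoinPartitioning.shuffledJoinOutput`) is a hypothetical stand-in and not the real Spark or plugin class; the broadcast joins derive their partitioning from the streamed side and would need separate handling.

```scala
// Hypothetical stand-ins for Spark's Partitioning hierarchy (not the real classes).
sealed trait Partitioning { def numPartitions: Int }
final case class HashPartitioning(keys: Seq[String], numPartitions: Int) extends Partitioning
final case class UnknownPartitioning(numPartitions: Int) extends Partitioning
final case class PartitioningCollection(parts: Seq[Partitioning]) extends Partitioning {
  require(parts.nonEmpty, "a partitioning collection needs at least one member")
  def numPartitions: Int = parts.head.numPartitions
}

// Hypothetical stand-ins for Spark's join types.
sealed trait JoinType
case object Inner extends JoinType
case object LeftOuter extends JoinType
case object RightOuter extends JoinType
case object FullOuter extends JoinType
case object LeftSemi extends JoinType
case object LeftAnti extends JoinType

object JoinPartitioning {
  // Assumed rule for a shuffled join: the output keeps the partitioning of
  // whichever side(s) cannot gain null-padded rows from the join.
  def shuffledJoinOutput(
      joinType: JoinType,
      left: Partitioning,
      right: Partitioning): Partitioning = joinType match {
    // Inner joins preserve both sides' partitioning.
    case Inner => PartitioningCollection(Seq(left, right))
    // Left-preserving joins keep the left side's partitioning.
    case LeftOuter | LeftSemi | LeftAnti => left
    // Right outer joins keep the right side's partitioning.
    case RightOuter => right
    // Full outer joins can null-pad either side, so only the count is known.
    case FullOuter => UnknownPartitioning(left.numPartitions)
  }
}
```

A test in the spirit of the first item could then assert that the partitioning reported by the GPU plan equals the one reported by the CPU plan for each join type, rather than hard-coding expected values.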


Mar 18 '24 14:03 andygrove