spark-rapids
[FEA] Add `GpuMapConcat` support for nested (array, struct, map) types.
Is your feature request related to a problem? Please describe.
This is a follow-up to https://github.com/NVIDIA/spark-rapids/pull/5533.
Describe the solution you'd like
`GpuMapConcat` should support arbitrarily nested types for values once https://github.com/rapidsai/cudf/pull/10890 is merged. For keys we need https://github.com/rapidsai/cudf/issues/11093, assuming that it also updates `drop_list_duplicates`. These do not have to be done at the same time. We can split them up into separate PRs for keys and values:
- [x] #5686
- [ ] #6290 (blocked by https://github.com/rapidsai/cudf/issues/11093)
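
The key/value split can be illustrated with a minimal sketch in plain Python. This is not the plugin's Scala/cuDF implementation, only a model of the semantics: values are carried through untouched, so nesting in values needs no comparison at all, while keys must be compared for equality to detect duplicates. The function name `map_concat` and the list-of-pairs representation are hypothetical. The two policy names follow Spark's `spark.sql.mapKeyDedupPolicy` setting (`EXCEPTION` and `LAST_WIN`), and tuples stand in for struct or array keys.

```python
from typing import Any, Hashable, Iterable, List, Tuple

Entry = Tuple[Hashable, Any]


def map_concat(maps: Iterable[List[Entry]],
               dedup_policy: str = "EXCEPTION") -> List[Entry]:
    """Concatenate maps given as lists of (key, value) pairs.

    Hypothetical helper: models map_concat semantics only.
    """
    merged = {}
    for entries in maps:
        for key, value in entries:
            # Only the key is inspected; the value is stored as-is,
            # however deeply nested it is.
            if key in merged and dedup_policy == "EXCEPTION":
                raise ValueError(f"duplicate map key: {key!r}")
            # Under LAST_WIN the later value overwrites the earlier one.
            merged[key] = value
    return list(merged.items())


# Nested values: no comparison of the values is ever needed.
nested_values = map_concat([
    [("a", [1, 2])],
    [("b", {"x": [3]})],
])

# Nested keys: equality on the whole nested key decides what is a duplicate.
nested_keys = map_concat(
    [[((1, "x"), 10)], [((1, "x"), 20)]],
    dedup_policy="LAST_WIN",
)
```

In this model, supporting nested values only requires copying columns, whereas supporting nested keys requires a duplicate-detection step that understands nested equality, which is why the key side depends on separate work.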
I removed `GpuConcat` because it is covered by #5542.
> We can split them up into separate PRs for keys and values.
Hi @revans2, I am a little confused about this comment. Do you mean that we could also support nested keys? But are nested types (such as array, struct and map) hashable?
Hi @HaoYang670, it seems https://github.com/rapidsai/cudf/issues/11093 has been closed. Can we support nested-type keys now?
Closing this issue, as we have completed the tasks in the checklist.