[FEA] Add `GpuMapConcat` support for nested (array, struct, map) types.

Open HaoYang670 opened this issue 3 years ago • 2 comments

Is your feature request related to a problem? Please describe.
This is a follow-up to https://github.com/NVIDIA/spark-rapids/pull/5533.

Describe the solution you'd like
`GpuMapConcat` should support arbitrarily nested types for values once https://github.com/rapidsai/cudf/pull/10890 is merged. For keys, we need https://github.com/rapidsai/cudf/issues/11093, assuming that it also updates `drop_list_duplicates`. These do not have to be done at the same time; we can split them up into separate PRs for keys and values:

  • [x] #5686
  • [ ] #6290 (blocked by https://github.com/rapidsai/cudf/issues/11093)
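
To make the key/value split concrete, here is a minimal pure-Python sketch of map concatenation semantics. It is not the spark-rapids or cudf implementation. The function name `map_concat`, the list-of-pairs representation, and the `dedup_policy` argument are assumptions made for illustration, modeled loosely on Spark's `spark.sql.mapKeyDedupPolicy` setting (`EXCEPTION` or `LAST_WIN`). It shows that nested values are simply carried through, while nested keys need an equality comparison to detect duplicates.

```python
def map_concat(*maps, dedup_policy="EXCEPTION"):
    """Concatenate maps given as lists of (key, value) pairs.

    Illustrative sketch only: maps are lists of pairs so that keys can be
    nested, unhashable Python values such as lists. Duplicate keys are
    found by equality, which is why nested keys need comparison support.
    """
    result = []
    for m in maps:
        for key, value in m:
            # Linear scan by equality; no hashing of the key is required.
            idx = next((i for i, (k, _) in enumerate(result) if k == key), None)
            if idx is None:
                result.append((key, value))
            elif dedup_policy == "LAST_WIN":
                # Keep the original position, overwrite with the later value.
                result[idx] = (key, value)
            else:
                raise ValueError(f"duplicate map key: {key!r}")
    return result


# Nested values (array, struct-like dict) pass through untouched.
nested_values = map_concat([("a", [1, 2])], [("b", {"x": 1})])

# Nested keys require equality checks to resolve duplicates.
nested_keys = map_concat([([1, 2], "old")], [([1, 2], "new")],
                         dedup_policy="LAST_WIN")
```

In this sketch the value side needs no comparison at all, while the key side depends entirely on an equality check over nested data, which matches the reason the two pieces of work can be split.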

HaoYang670 avatar May 20 '22 07:05 HaoYang670

I removed GpuConcat because it is covered by #5542

revans2 avatar May 20 '22 14:05 revans2

> We can split them up into separate PRs for keys and values.

Hi @revans2, I am a little confused about this comment. Do you mean that we could also support nested keys? But are nested types (such as array, struct and map) hashable?

HaoYang670 avatar May 28 '22 08:05 HaoYang670

Hi @HaoYang670, it seems https://github.com/rapidsai/cudf/issues/11093 has been closed. Can we support nested-type keys now?

GaryShen2008 avatar Aug 19 '22 12:08 GaryShen2008

Closing this issue, as we have completed the tasks in the checklist.

HaoYang670 avatar Aug 26 '22 23:08 HaoYang670