spark icon indicating copy to clipboard operation
spark copied to clipboard

[SPARK-47007][SQL] SortMap function

Open stefankandic opened this issue 1 year ago • 1 comments

What changes were proposed in this pull request?

Adding a new function SortMap

Why are the changes needed?

In order to add the ability to do GROUP BY on map types we first have to be able to sort the maps by their key

Does this PR introduce any user-facing change?

Yes, new function SortMap

How was this patch tested?

With new UTs

Was this patch authored or co-authored using generative AI tooling?

No

stefankandic avatar Feb 08 '24 13:02 stefankandic

Should re-run SPARK_GENERATE_GOLDEN_FILES=1 build/sbt "sql/testOnly *ExpressionsSchemaSuite" to re-generate golden files

LuciferYang avatar Feb 10 '24 02:02 LuciferYang

updated the title since it also touch python/r/connect

zhengruifeng avatar Mar 07 '24 04:03 zhengruifeng

cc @cloud-fan too

HyukjinKwon avatar Mar 14 '24 01:03 HyukjinKwon

+1, LGTM. Merging to master. Thank you, @stevomitric @stefankandic and @HyukjinKwon @zhengruifeng for review.

MaxGekk avatar Mar 20 '24 04:03 MaxGekk

Sorry I missed this. Why do we add this public function? Do other systems have it? To support GROUP BY map type, an internal MapSort expression is sufficient.

cloud-fan avatar Mar 20 '24 10:03 cloud-fan

Do other systems have it?

@stevomitric @stefankandic Could you check other systems, please.

MaxGekk avatar Mar 20 '24 19:03 MaxGekk

I can't find it in other systems, and it does not make sense as map elements are order-less. I'm reverting it, please re-submit it without exposing the function publicly.

cloud-fan avatar Mar 21 '24 02:03 cloud-fan

+1 for Wenchen's decision. Thank you for reverting.

dongjoon-hyun avatar Mar 21 '24 03:03 dongjoon-hyun

Created new PR which omits creating new map_sort function - @MaxGekk @cloud-fan

stevomitric avatar Mar 21 '24 13:03 stevomitric