[SPARK-47007][SQL] SortMap function
What changes were proposed in this pull request?
Adds a new function, SortMap.
Why are the changes needed?
To support GROUP BY on map types, we first need to be able to sort a map's entries by key.
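As a rough illustration of why sorting matters here (a plain-Python sketch, not Spark code; `sort_map` and `group_by_map` are hypothetical names used only for this example): two maps with the same entries in a different order should land in the same group, and sorting entries by key gives each map one canonical form that can be compared or hashed.

```python
from collections import defaultdict

def sort_map(entries):
    """Return the map's entries as a tuple sorted by key (a canonical form)."""
    return tuple(sorted(entries, key=lambda kv: kv[0]))

def group_by_map(rows):
    """Group (map_entries, value) rows by map content, ignoring entry order."""
    groups = defaultdict(list)
    for entries, value in rows:
        # Entries are sorted first, so insertion order no longer affects the group key.
        groups[sort_map(entries)].append(value)
    return dict(groups)

rows = [
    ([("b", 2), ("a", 1)], 10),  # same map as the next row, different entry order
    ([("a", 1), ("b", 2)], 20),
    ([("a", 1)], 30),
]
result = group_by_map(rows)
```

Without the sorting step, the first two rows would produce different group keys even though they describe the same map.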
Does this PR introduce any user-facing change?
Yes, a new function, SortMap.
How was this patch tested?
With new unit tests.
Was this patch authored or co-authored using generative AI tooling?
No
You should re-run SPARK_GENERATE_GOLDEN_FILES=1 build/sbt "sql/testOnly *ExpressionsSchemaSuite" to regenerate the golden files.
Updated the title since it also touches Python/R/Connect.
cc @cloud-fan too
+1, LGTM. Merging to master. Thank you, @stevomitric @stefankandic and @HyukjinKwon @zhengruifeng for review.
Sorry I missed this. Why do we add this public function? Do other systems have it? To support GROUP BY on map types, an internal MapSort expression is sufficient.
Do other systems have it?
@stevomitric @stefankandic Could you check other systems, please?
I can't find it in other systems, and it does not make sense because map elements are unordered. I'm reverting it; please re-submit it without exposing the function publicly.
+1 for Wenchen's decision. Thank you for reverting.
Created a new PR that omits the public map_sort function. cc @MaxGekk @cloud-fan