[SPARK-47094][SQL] SPJ: fix bucket reducer function
What changes were proposed in this pull request?
The SPJ work on compatible buckets includes an implementation of a reducible function. This patch fixes that implementation so that it matches the one in Apache Iceberg.
Why are the changes needed?
With this fix, a GCD of 1 is no longer accepted for incompatible bucket counts, so the buckets are not reduced to a single bucket when the two sides have incompatible numbers of buckets.
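A minimal sketch of the guard described above, written in Python for brevity rather than in the language of the actual patch. The helper name `reduced_bucket_count` is hypothetical, and both conditions in the `if` are assumptions about the shape of the fix inferred from this description, not a copy of the Spark or Iceberg code.

```python
from math import gcd
from typing import Optional


def reduced_bucket_count(this_buckets: int, other_buckets: int) -> Optional[int]:
    """Return the bucket count to reduce to, or None if no reduction applies.

    Hypothetical helper for illustration only; not part of any Spark API.
    """
    common = gcd(this_buckets, other_buckets)
    # Assumed guard: a GCD of 1 means the bucket counts are incompatible,
    # so reducing would collapse every bucket into a single one.
    # Assumed second check: if this side already has `common` buckets,
    # there is nothing to reduce on this side.
    if common > 1 and common != this_buckets:
        return common
    return None
```

Under these assumptions, bucket counts 4 and 6 reduce to 2, while 4 and 7 share no divisor greater than 1 and are left unreduced.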
Does this PR introduce any user-facing change?
No.
How was this patch tested?
With unit tests
Was this patch authored or co-authored using generative AI tooling?
No.
@szehon-ho please take a look.
With this fix, a GCD of 1 is no longer accepted for incompatible bucket counts, so the buckets are not reduced to a single bucket when the two sides have incompatible numbers of buckets.
So previously, when it was reduced to 1, was that a correctness issue or just a performance issue?
It is a performance issue: if the buckets reduce to 1, only one task does all the work.
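To illustrate the performance concern, here is a rough sketch in Python. It assumes that a reducer maps each bucket id to `bucket % gcd`; that mapping and the helper name `reduced_partitions` are assumptions made for illustration, not taken from this thread.

```python
from math import gcd
from typing import Set


def reduced_partitions(num_buckets: int, other_buckets: int) -> Set[int]:
    """Distinct partition ids left after mapping each bucket id to bucket % gcd.

    Hypothetical helper; the modulo mapping is an assumed reducer behaviour.
    """
    common = gcd(num_buckets, other_buckets)
    return {bucket % common for bucket in range(num_buckets)}


# Compatible counts (4 and 6) keep two partitions, so two tasks can run.
compatible = reduced_partitions(4, 6)
# Incompatible counts (4 and 7) collapse to one partition, so one task does everything.
incompatible = reduced_partitions(4, 7)
```

The result would still be correct in the collapsed case; the cost is that all parallelism is lost.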
@viirya it seems this is a test-only transform, but it is still good to have a correct example.
Oh okay, I didn't see that it is test-only code.
@viirya please take another look.
cc @huaxingao
Merged to master. Thanks @himadripal @szehon-ho @viirya
Thank you all.