[SPARK-48719][SQL] Fix the calculation bug of `RegrSlope` & `RegrIntercept` when the first parameter is null
What changes were proposed in this pull request?
This PR fixes a calculation bug in `RegrSlope` and `RegrIntercept` when the first parameter is null. Regardless of whether the first parameter (y) or the second parameter (x) is null, the tuple should be filtered out.
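For illustration only, here is a minimal pure-Python sketch of the intended semantics. It is not the Spark implementation. It assumes the usual SQL definitions, `regr_slope(y, x) = covar_pop(y, x) / var_pop(x)` and `regr_intercept(y, x) = avg(y) - slope * avg(x)`, computed only over tuples where both values are non-null. The helper name `_complete_pairs` and the choice to return `None` for an empty input or zero variance in x are assumptions made for this sketch.

```python
from typing import Iterable, List, Optional, Tuple

# Each tuple is (y, x), matching the argument order regr_slope(y, x).
Pair = Tuple[Optional[float], Optional[float]]


def _complete_pairs(pairs: Iterable[Pair]) -> List[Tuple[float, float]]:
    """Hypothetical helper: keep only tuples where BOTH y and x are non-null."""
    return [(y, x) for y, x in pairs if y is not None and x is not None]


def regr_slope(pairs: Iterable[Pair]) -> Optional[float]:
    """Slope of the least-squares line, ignoring tuples with any null."""
    rows = _complete_pairs(pairs)
    n = len(rows)
    if n == 0:
        return None
    mean_y = sum(y for y, _ in rows) / n
    mean_x = sum(x for _, x in rows) / n
    var_x = sum((x - mean_x) ** 2 for _, x in rows) / n
    if var_x == 0:
        # Assumption for this sketch: slope is undefined when x is constant.
        return None
    covar = sum((y - mean_y) * (x - mean_x) for y, x in rows) / n
    return covar / var_x


def regr_intercept(pairs: Iterable[Pair]) -> Optional[float]:
    """Intercept of the least-squares line, ignoring tuples with any null."""
    rows = _complete_pairs(pairs)
    slope = regr_slope(rows)
    if slope is None:
        return None
    n = len(rows)
    mean_y = sum(y for y, _ in rows) / n
    mean_x = sum(x for _, x in rows) / n
    return mean_y - slope * mean_x


# Points on y = 2x - 1, plus one tuple with a null y and one with a null x.
clean = [(1.0, 1.0), (3.0, 2.0), (5.0, 3.0)]
with_nulls = clean + [(None, 10.0), (7.0, None)]
print(regr_slope(clean), regr_slope(with_nulls))
print(regr_intercept(clean), regr_intercept(with_nulls))
```

With this filtering, the tuples `(None, 10.0)` and `(7.0, None)` contribute to none of the intermediate sums, so both inputs yield the same slope and intercept.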
Why are the changes needed?
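To fix a correctness bug: `RegrSlope` and `RegrIntercept` calculate the wrong result when the first parameter (y) of a tuple is null.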
Does this PR introduce any user-facing change?
Yes. The results of `RegrSlope` and `RegrIntercept` change when the first value of a tuple is null; the new results are the correct ones.
How was this patch tested?
Pass GA and test with `build/sbt "~sql/testOnly org.apache.spark.sql.SQLQueryTestSuite -- -z linear-regression.sql"`
Was this patch authored or co-authored using generative AI tooling?
No.
cc @beliefer
Gentle ping @HyukjinKwon, when you have time.
seems ok, cc @beliefer and @cloud-fan
thanks, merging to master!
can you open a backport PR for 3.5?
Ok, let me do it. Thank you for the review. @cloud-fan