[SPARK-41794][SQL] Add `try_remainder` function and re-enable column tests
What changes were proposed in this pull request?
As part of re-enabling the ANSI mode tests for Spark Connect, we discovered that there is no `try_*` equivalent for the remainder operation. This patch adds the `try_remainder` function in Scala, Python, and Spark Connect and adds the required testing.
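For illustration, a minimal PySpark usage sketch (assuming the `pyspark.sql.functions.try_remainder` API added here; the output shown is indicative only, and the Scala and Spark Connect APIs mirror the same signature):
```python
# Minimal usage sketch of try_remainder in PySpark (indicative only).
from pyspark.sql import SparkSession
from pyspark.sql import functions as sf

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([(10, 3), (7, 0)], ["a", "b"])

# try_remainder(a, b) computes a % b and returns NULL when b is zero.
df.select(sf.try_remainder("a", "b").alias("r")).show()
# +----+
# |   r|
# +----+
# |   1|
# |NULL|
# +----+

# The function is also registered for SQL.
spark.sql("SELECT try_remainder(7, 0) AS r").show()
```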
Why are the changes needed?
ANSI mode support for Spark 4.
Does this PR introduce any user-facing change?
Yes, it adds the `try_remainder` function, which handles division by zero the way the other `try_*` functions do under ANSI mode (returning NULL instead of raising an error).
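A hedged sketch of the behavioral contrast with ANSI mode enabled (assuming `spark.sql.ansi.enabled=true`; the error behavior noted in the comment is indicative):
```python
# Sketch of the user-facing difference under ANSI mode.
from pyspark.sql import SparkSession
from pyspark.sql import functions as sf

spark = SparkSession.builder.getOrCreate()
spark.conf.set("spark.sql.ansi.enabled", "true")

df = spark.createDataFrame([(7, 0)], ["a", "b"])

# The plain remainder operator fails on a zero divisor under ANSI mode:
#   df.select(sf.expr("a % b")).collect()   # raises a divide-by-zero error
# try_remainder returns NULL instead of failing:
df.select(sf.try_remainder("a", "b").alias("r")).show()
```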
How was this patch tested?
Added new UT and E2E tests.
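A hypothetical sketch of the kind of unit check covered for the divide-by-zero path (names and structure are illustrative, not the actual test code in this patch):
```python
# Hypothetical unit-test sketch for the divide-by-zero path (illustrative only).
from pyspark.sql import SparkSession
from pyspark.sql import functions as sf
from pyspark.testing import assertDataFrameEqual

spark = SparkSession.builder.getOrCreate()

def test_try_remainder_divide_by_zero():
    actual = spark.createDataFrame([(7, 0)], ["a", "b"]).select(
        sf.try_remainder("a", "b").alias("r")
    )
    # A zero divisor yields NULL rather than an error.
    expected = spark.createDataFrame([(None,)], "r BIGINT")
    assertDataFrameEqual(actual, expected)
```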
Was this patch authored or co-authored using generative AI tooling?
No
cc @gengliangwang
Could you fix the Python linter failure, @grundprinzip?
./python/pyspark/sql/tests/connect/test_connect_column.py:1029:5: F401 'os' imported but unused
import os
^
1 F401 'os' imported but unused
Yes, done.
Thank you.
BTW, is the recent test failure relevant? To me it looks unrelated. Could you take a look?
[info] *** 1 TEST FAILED ***
[error] Failed: Total 10578, Failed 1, Errors 0, Passed 10577, Ignored 29
[error] Failed tests:
[error] org.apache.spark.sql.execution.python.PythonStreamingDataSourceSuite
[error] (sql / Test / test) sbt.TestsFailedException: Tests unsuccessful
[error] Total time: 3020 s (50:20), completed May 8, 2024, 1:48:01 PM
@grundprinzip thanks for the work! Let's also mention the new function in https://spark.apache.org/docs/latest/sql-ref-ansi-compliance.html#useful-functions-for-ansi-mode
@gengliangwang @dongjoon-hyun I addressed all comments and added the additional test, PTAL. Thanks!
Thanks for adding the new function! Merging to master.
@grundprinzip Shall we create a new JIRA for this instead of using https://issues.apache.org/jira/browse/SPARK-41794?
Added it to the Python API references in https://github.com/apache/spark/pull/46566.