
[SPARK-39865][SQL] Show proper error messages on the overflow errors of table insert

amahussein opened this issue

Spark-context

  • spark-3.4: https://github.com/apache/spark/commit/d5dbe7d4e9
    • Followup: https://github.com/apache/spark/commit/9f9a0a4507
  • spark-3.3: https://github.com/apache/spark/commit/19991047d5

What changes were proposed in the Spark pull request?

Table insertion uses the same CAST expression, but its error message was not updated to follow the improvements made to ANSI CAST error handling.

Old error message:

> create table tiny(i tinyint);
> insert into tiny values (1000);

org.apache.spark.SparkArithmeticException[CAST_OVERFLOW]: The value 1000 of the type "INT" cannot 
be cast to "TINYINT" due to an overflow. Use `try_cast` to tolerate overflow and return NULL instead.
If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.

New Message:

org.apache.spark.SparkArithmeticException: [CAST_OVERFLOW_IN_TABLE_INSERT] Fail to insert a value of
"INT" type into the "TINYINT" type column `i` due to an overflow. Use `try_cast` on the input value to tolerate
overflow and return NULL instead.
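The overflow itself is a simple range check: TINYINT is a signed 8-bit integer, so any value outside [-128, 127] cannot be stored, and the message's hint suggests `try_cast`, which returns NULL instead of raising. A minimal sketch of that behavior (plain Python for illustration, not Spark code; the function names are hypothetical):

```python
# Signed 8-bit (TINYINT) bounds, matching Spark's ByteType.
TINYINT_MIN, TINYINT_MAX = -128, 127

def fits_in_tinyint(value: int) -> bool:
    """Return True if `value` can be stored in a TINYINT column without overflow."""
    return TINYINT_MIN <= value <= TINYINT_MAX

def try_cast_tinyint(value: int):
    """Mimic ANSI `try_cast` semantics: the value on success, None (NULL) on overflow."""
    return value if fits_in_tinyint(value) else None
```

With these definitions, `try_cast_tinyint(1000)` yields None, which corresponds to the NULL that `try_cast` produces for the failing `insert into tiny values (1000)` above.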

Why are the changes needed in Spark?

The current error message is confusing, and the hint "If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error" is unhelpful for a table insert overflow.

Does this PR introduce any user-facing change?

  • Yes, the exception message changes

Why might it affect RAPIDS?

  • The plugin needs to check the data type of the cast during table insertion and detect overflow.
  • AnsiCastOpSuite.doTableInsert() (https://github.com/NVIDIA/spark-rapids/blob/branch-22.08/tests/src/test/scala/com/nvidia/spark/rapids/AnsiCastOpSuite.scala#L741) will be affected by the change, as the exception message is different.
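Any test pinned to the old error class will break, because the error class changed from CAST_OVERFLOW to CAST_OVERFLOW_IN_TABLE_INSERT and the wording changed with it. A sketch of that breakage (plain Python for illustration, not the actual AnsiCastOpSuite code; the helper name is hypothetical):

```python
# Abbreviated forms of the two messages quoted earlier in this issue.
OLD_MESSAGE = ('[CAST_OVERFLOW] The value 1000 of the type "INT" cannot be cast '
               'to "TINYINT" due to an overflow.')
NEW_MESSAGE = ('[CAST_OVERFLOW_IN_TABLE_INSERT] Fail to insert a value of "INT" '
               'type into the "TINYINT" type column `i` due to an overflow.')

def is_table_insert_overflow(message: str) -> bool:
    """Match the new error class; a test checking only '[CAST_OVERFLOW]' would miss it."""
    return "CAST_OVERFLOW_IN_TABLE_INSERT" in message
```

Note the subtlety: CAST_OVERFLOW is a substring of CAST_OVERFLOW_IN_TABLE_INSERT, so a test must match the bracketed class (or the full message) to distinguish the two cases reliably.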

Impact on Testing?

Yes.

  • Update AnsiCastOpSuite
  • May need fixes to the expected exception messages

Requires Doc update?

No.

amahussein avatar Aug 01 '22 20:08 amahussein

This change went into Spark 3.3.1, so will need to be fixed before we support that version.

sameerz avatar Aug 02 '22 20:08 sameerz

@amahussein I am not seeing the old message when I run the given script against Spark 3.3.2, instead I see the new message. In addition to that, all the overflow tests in AnsiCastOpSuite are passing. Is this still a valid issue?

razajafri avatar Oct 25 '22 00:10 razajafri

#6256 added support for the OverflowInTableInsert and the tests were modified to pass that check. This issue was opened to revisit the behavior, including handling of the exception and the error. If you find that the plugin behavior is identical to Spark 3.3 and Spark 3.4, then feel free to close the issue.

amahussein avatar Oct 25 '22 19:10 amahussein