
[SPARK-46820][PYTHON] Fix error message regression by restoring `new_msg`

Open itholic opened this issue 1 year ago • 1 comments

What changes were proposed in this pull request?

This PR proposes to fix an error message regression by restoring `new_msg`.

Why are the changes needed?

In the past few PRs, we mistakenly removed `new_msg`, which introduced an error message regression.

Does this PR introduce any user-facing change?

No API change, but the user-facing error message is improved.

Before

>>> from pyspark.sql.types import StructType, StructField, StringType, IntegerType
>>> schema = StructType([
...     StructField("name", StringType(), nullable=True),
...     StructField("age", IntegerType(), nullable=False)
... ])
>>> df = spark.createDataFrame([(["asd", None])], schema)
pyspark.errors.exceptions.base.PySparkValueError: [CANNOT_BE_NONE] Argument `obj` cannot be None.

After

>>> from pyspark.sql.types import StructType, StructField, StringType, IntegerType
>>> schema = StructType([
...     StructField("name", StringType(), nullable=True),
...     StructField("age", IntegerType(), nullable=False)
... ])
>>> df = spark.createDataFrame([(["asd", None])], schema)
pyspark.errors.exceptions.base.PySparkValueError: field age: This field is not nullable, but got None
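To illustrate the kind of wrapper being restored, here is a minimal, Spark-free sketch. It assumes `new_msg` is a small helper that prefixes the field name to a verification error; `make_nullable_verifier` and its parameters are hypothetical names chosen for this example and are not PySpark's actual code.

```python
from typing import Any, Callable, Optional


def make_nullable_verifier(
    nullable: bool, name: Optional[str] = None
) -> Callable[[Any], None]:
    """Hypothetical stand-in for a per-field verifier (not PySpark's code)."""
    if name is None:
        # No field context available: pass the message through unchanged.
        new_msg = lambda msg: msg
    else:
        # Prefix the field name so the user can tell which field failed.
        new_msg = lambda msg: f"{name}: {msg}"

    def verify(obj: Any) -> None:
        if obj is None and not nullable:
            raise ValueError(new_msg("This field is not nullable, but got None"))

    return verify


verify_age = make_nullable_verifier(nullable=False, name="field age")
try:
    verify_age(None)
except ValueError as exc:
    message = str(exc)
    print(message)
```

Without the `new_msg` wrapper, the error carries no field context, which matches the less helpful message in the "Before" example.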

How was this patch tested?

The existing CI should pass.

Was this patch authored or co-authored using generative AI tooling?

No.

itholic avatar Jan 24 '24 03:01 itholic

Thanks @HyukjinKwon for reviewing. I just fixed the regressions from the past few PRs and updated the PR title and description accordingly.

itholic avatar Jan 24 '24 04:01 itholic

Merged to master.

HyukjinKwon avatar Feb 20 '24 02:02 HyukjinKwon