[SPARK-46820][PYTHON] Fix error message regression by restoring `new_msg`
What changes were proposed in this pull request?
This PR proposes to fix an error message regression by restoring `new_msg`.
Why are the changes needed?
In the past few PRs, we mistakenly removed `new_msg`, which introduced an error message regression.
Does this PR introduce any user-facing change?
No API change, but the user-facing error message is improved:
Before
>>> from pyspark.sql.types import StructType, StructField, StringType, IntegerType
>>> schema = StructType([
... StructField("name", StringType(), nullable=True),
... StructField("age", IntegerType(), nullable=False)
... ])
>>> df = spark.createDataFrame([(["asd", None])], schema)
pyspark.errors.exceptions.base.PySparkValueError: [CANNOT_BE_NONE] Argument `obj` cannot be None.
After
>>> from pyspark.sql.types import StructType, StructField, StringType, IntegerType
>>> schema = StructType([
... StructField("name", StringType(), nullable=True),
... StructField("age", IntegerType(), nullable=False)
... ])
>>> df = spark.createDataFrame([(["asd", None])], schema)
pyspark.errors.exceptions.base.PySparkValueError: field age: This field is not nullable, but got None
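For context, the restored `new_msg` is a small message-wrapping closure used during schema verification: it prefixes a raw error message with the name of the field being verified, so the user sees which field failed instead of a generic error. The sketch below illustrates the pattern only; the function and parameter names (`make_verifier`, `verify_nullability`) are illustrative and not the exact `pyspark.sql.types` internals.

```python
# Minimal sketch of the `new_msg` pattern (illustrative names, not the
# actual pyspark.sql.types implementation).

def make_verifier(name=None):
    if name is not None:
        def new_msg(msg):
            # Prefix the raw message with the field name for context.
            return "field %s: %s" % (name, msg)
    else:
        def new_msg(msg):
            return msg

    def verify_nullability(obj, nullable):
        # Reject None for non-nullable fields, with the field name attached.
        if obj is None and not nullable:
            raise ValueError(new_msg("This field is not nullable, but got None"))

    return verify_nullability

verify_age = make_verifier("age")
try:
    verify_age(None, nullable=False)
except ValueError as e:
    print(e)  # field age: This field is not nullable, but got None
```

Dropping the wrapper and raising the bare message directly is what produced the regression shown above: the `field age:` context disappeared and a generic `CANNOT_BE_NONE` error surfaced instead.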
How was this patch tested?
The existing CI should pass
Was this patch authored or co-authored using generative AI tooling?
No.
Thanks @HyukjinKwon for reviewing. Just fixed regressions from the past few PRs, and updated the PR title and description accordingly.
Merged to master.