[SPARK-47079][PYTHON][SQL][CONNECT] Add Variant type info to PySpark
What changes were proposed in this pull request?
The Variant data type was added in https://github.com/apache/spark/pull/43707, but the equivalent PySpark type was not. This PR adds Variant to PySpark, which makes it possible to create PySpark DataFrames containing the Variant type.
Why are the changes needed?
Without this PR, trying to create a DataFrame containing a Variant type fails with:
AssertionError: Undefined error message parameter for error class: CANNOT_PARSE_DATATYPE. Parameters: {'error': "Undefined error message parameter for error class: CANNOT_PARSE_DATATYPE. Parameters: {'error': 'variant'}"}.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Added new PySpark type tests involving Variant.
Was this patch authored or co-authored using generative AI tooling?
No.