[SPARK-40001][SQL] Add config to make DEFAULT values in JSON tables mutually exclusive with SQLConf.JSON_GENERATOR_IGNORE_NULL_FIELDS
What changes were proposed in this pull request?
Add config to make DEFAULT values in JSON tables mutually exclusive with SQLConf.JSON_GENERATOR_IGNORE_NULL_FIELDS.
When this new config is true, DEFAULT column values are allowed in JSON tables even when the SQLConf.JSON_GENERATOR_IGNORE_NULL_FIELDS conf is enabled. Otherwise, the two are mutually exclusive. Making them mutually exclusive is useful for enforcing that inserted NULL values are present in storage, so that they can be distinguished from missing data.
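The gating described above can be sketched roughly as follows. This is a standalone Python illustration, not the Spark implementation: the `Conf` class, the field `allow_json_defaults_with_ignore_null_fields`, and the function `check_json_default_columns` are hypothetical names chosen for this example, and the real config key and error message are not shown in this description.

```python
from dataclasses import dataclass


@dataclass
class Conf:
    # Stand-in for SQLConf.JSON_GENERATOR_IGNORE_NULL_FIELDS.
    json_generator_ignore_null_fields: bool
    # Hypothetical stand-in for the new config proposed in this PR.
    allow_json_defaults_with_ignore_null_fields: bool


def check_json_default_columns(conf: Conf, table_format: str,
                               has_default_columns: bool) -> None:
    """Raise ValueError if DEFAULT columns are combined with null-field
    dropping in a JSON table without the new config being set to true."""
    # The check only concerns JSON tables that declare DEFAULT columns.
    if table_format.lower() != "json" or not has_default_columns:
        return
    if (conf.json_generator_ignore_null_fields
            and not conf.allow_json_defaults_with_ignore_null_fields):
        raise ValueError(
            "DEFAULT column values in JSON tables cannot be combined with "
            "ignoring NULL fields unless explicitly allowed"
        )
```

With both flags true the check passes; with null-field dropping enabled and the new flag false, a JSON table that has DEFAULT columns is rejected.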
Why are the changes needed?
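This helps guard the correctness of query results: if NULL fields are omitted from the written JSON, an explicitly inserted NULL cannot be told apart from a missing field when the data is read back.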
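To make the correctness concern concrete, here is a small pure-Python simulation. It is an assumption-laden sketch and does not use Spark: `write_row` and `read_row` are hypothetical helpers that mimic a JSON writer which drops NULL fields and a reader which fills absent fields with the column DEFAULT.

```python
import json


def write_row(row: dict, ignore_null_fields: bool) -> str:
    """Serialize a row to one JSON line, optionally dropping NULL fields."""
    if ignore_null_fields:
        row = {k: v for k, v in row.items() if v is not None}
    return json.dumps(row)


def read_row(line: str, defaults: dict) -> dict:
    """Parse a JSON line; fields absent from the line get the column DEFAULT."""
    parsed = json.loads(line)
    return {col: parsed[col] if col in parsed else default
            for col, default in defaults.items()}


# Hypothetical table: column "id" has no default, "score" has DEFAULT 42.
defaults = {"id": None, "score": 42}
row = {"id": 1, "score": None}  # the user explicitly inserted NULL

dropped = read_row(write_row(row, ignore_null_fields=True), defaults)
kept = read_row(write_row(row, ignore_null_fields=False), defaults)

print(dropped)  # {'id': 1, 'score': 42}   NULL was silently replaced
print(kept)     # {'id': 1, 'score': None} NULL survived the round trip
```

In the first case the explicit NULL is lost on write, so the reader cannot distinguish it from a field that was never provided.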
Does this PR introduce any user-facing change?
Yes, please see above.
How was this patch tested?
This PR adds new unit test coverage.
Hi @gengliangwang, this PR double-checks correctness for column DEFAULT values in another corner case (JSON tables).
Can one of the admins verify this patch?
Hi @HyukjinKwon, thanks for your review, please take another look when ready. @gengliangwang FYI
+1 on having such a new configuration.
Hi @gengliangwang, I responded to the comments; this is ready for another round when you are :)
Thanks, merging to master