[SPARK-40001][SQL] Add config to make DEFAULT values in JSON tables mutually exclusive with SQLConf.JSON_GENERATOR_IGNORE_NULL_FIELDS

Open dtenedor opened this issue 2 years ago • 2 comments

What changes were proposed in this pull request?

Add config to make DEFAULT values in JSON tables mutually exclusive with SQLConf.JSON_GENERATOR_IGNORE_NULL_FIELDS.

When this new config is true, DEFAULT column values are allowed on JSON tables even when the JSON_GENERATOR_IGNORE_NULL_FIELDS conf is enabled. Otherwise, the two are mutually exclusive. This is useful for enforcing that inserted NULL values are actually written to storage, so that they can be distinguished from missing data.
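
To illustrate why the two features interact, here is a minimal sketch in plain Python. It is not Spark code. It assumes a simplified model in which a column absent from a stored JSON row is filled with the column's DEFAULT on read, and in which ignoring null fields means dropping NULL-valued keys on write. The names `DEFAULTS`, `write_row`, and `read_row` are hypothetical and exist only for this example.

```python
import json

# Hypothetical column default, standing in for a DEFAULT declared on column "b".
DEFAULTS = {"b": 42}


def write_row(row, ignore_null_fields):
    """Serialize a row to JSON, optionally dropping NULL-valued fields."""
    if ignore_null_fields:
        row = {k: v for k, v in row.items() if v is not None}
    return json.dumps(row)


def read_row(line, columns):
    """Parse a stored row; a column missing from storage gets its DEFAULT (or NULL)."""
    stored = json.loads(line)
    return {c: stored[c] if c in stored else DEFAULTS.get(c) for c in columns}


# An explicit NULL is inserted into column "b".
row = {"a": 1, "b": None}

# NULL is written to storage, so it is read back as NULL.
kept = read_row(write_row(row, ignore_null_fields=False), ["a", "b"])

# NULL is dropped on write, so the reader cannot tell it from missing data
# and substitutes the DEFAULT instead.
dropped = read_row(write_row(row, ignore_null_fields=True), ["a", "b"])

print(kept)
print(dropped)
```

Under these assumptions, the second round trip silently turns an inserted NULL into the default value, which is the kind of ambiguity between NULL and missing data that the description refers to.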

Why are the changes needed?

This helps guard the correctness of query results for JSON tables whose columns have DEFAULT values.

Does this PR introduce any user-facing change?

Yes, please see above.

How was this patch tested?

This PR adds new unit test coverage.

dtenedor avatar Aug 08 '22 00:08 dtenedor

Hi @gengliangwang this PR double-checks correctness for column DEFAULT values with another corner case (JSON tables).

dtenedor avatar Aug 08 '22 00:08 dtenedor

Can one of the admins verify this patch?

AmplabJenkins avatar Aug 08 '22 01:08 AmplabJenkins

Hi @HyukjinKwon, thanks for your review, please take another look when ready. @gengliangwang FYI

dtenedor avatar Aug 11 '22 20:08 dtenedor

+1 on having such a new configuration.

gengliangwang avatar Aug 12 '22 17:08 gengliangwang

Hi @gengliangwang, I responded to the comments; this is ready for another round when you are :)

dtenedor avatar Aug 12 '22 19:08 dtenedor

Thanks, merging to master

gengliangwang avatar Aug 13 '22 17:08 gengliangwang