[FLINK-35427][json] Support fail-on-unknown-field config in json format
What is the purpose of the change
As described in FLINK-35427, this pull request introduces a config option in the JSON format so that a Flink job can fail when the input schema evolves.
Copied from FLINK-35427: In many cases, the producers and consumers of a message queue belong to different teams, or even different companies. As a message consumer, it is sometimes difficult to keep track of changes to the message format, which may result in data loss. We want to ensure that the message format strictly matches the schema: if a field is missing or a new field is added, it is better to fail the application so that we notice and fix the problem quickly. In this case, a fail-on-unknown-field config for the JSON format would be very helpful, especially when combined with fail-on-missing-field.
Brief change log
- Add JsonFormatOptions.FAIL_ON_UNKNOWN_FIELD
- Add unknown field check code in converter classes
- Add JsonSchemaException, which extends JsonParseException
- Fix some test cases that were not working properly
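The unknown field check listed above can be illustrated with a minimal, standalone sketch. It is not the code from this PR: it uses only the JDK, models a parsed JSON object as a `Map`, and every name in it (`UnknownFieldCheckSketch`, `SchemaMismatchException`, `validate`) is hypothetical. The real check lives in the Flink converter classes and throws `JsonSchemaException`.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class UnknownFieldCheckSketch {

    /** Hypothetical stand-in for the PR's JsonSchemaException. */
    static class SchemaMismatchException extends RuntimeException {
        SchemaMismatchException(String message) {
            super(message);
        }
    }

    /** Returns the field names present in the record but absent from the schema. */
    static Set<String> unknownFields(Set<String> schemaFields, Map<String, ?> record) {
        Set<String> unknown = new LinkedHashSet<>(record.keySet());
        unknown.removeAll(schemaFields);
        return unknown;
    }

    /**
     * Throws if the flag is enabled and the record carries a field the schema
     * does not declare. With the flag disabled, extra fields are ignored.
     */
    static void validate(
            Set<String> schemaFields, Map<String, ?> record, boolean failOnUnknownField) {
        if (!failOnUnknownField) {
            return;
        }
        Set<String> unknown = unknownFields(schemaFields, record);
        if (!unknown.isEmpty()) {
            throw new SchemaMismatchException("Unknown field(s) in JSON input: " + unknown);
        }
    }

    public static void main(String[] args) {
        Set<String> schema = Set.of("id", "name");
        // The producer added an "email" field that the consumer's schema lacks.
        Map<String, Object> evolved = Map.of("id", 1, "name", "a", "email", "a@example.com");

        validate(schema, evolved, false); // flag off: extra field is silently dropped

        try {
            validate(schema, evolved, true); // flag on: the job should fail
        } catch (SchemaMismatchException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With the flag disabled the extra field is ignored, which is the behavior the issue describes as a source of silent data loss; with it enabled, the mismatch surfaces as an exception.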
Verifying this change
This change added tests and can be verified as follows:
JsonFormatFactoryTest#testFailOnUnknownField
JsonParserRowDataDeSerSchemaTest#testParsePartialJson
JsonRowDataSerDeSchemaTest#testDeserializationUnknownField
JsonRowDeserializationSchemaTest#testUnknownNode
Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): (no)
- The public API, i.e., is any changed class annotated with @Public(Evolving): (yes)
- The serializers: (don't know)
- The runtime per-record code paths (performance sensitive): (no)
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
- The S3 file system connector: (no)
Documentation
- Does this pull request introduce a new feature? (yes)
- If yes, how is the feature documented? (docs)
CI report:
- 5e295d0d0d2c361e9786973ae13abef49577eee5 Azure: SUCCESS
Bot commands
The @flinkbot bot supports the following commands:
- @flinkbot run azure: re-run the last Azure build
This PR is being marked as stale since it has not had any activity in the last 90 days. If you would like to keep this PR alive, please leave a comment asking for a review. If the PR has merge conflicts, update it with the latest from the base branch.
If you are having difficulty finding a reviewer, please reach out to the community; contact details can be found here: https://flink.apache.org/what-is-flink/community/
If this PR is no longer valid or desired, please feel free to close it. If no activity occurs in the next 30 days, it will be automatically closed.
This PR has been closed since it has not had any activity in 120 days. If you feel like this was a mistake, or you would like to continue working on it, please feel free to re-open the PR and ask for a review.