[FLINK-35427][json] Support fail-on-unknown-field config in json format

Open huhuan1898 opened this issue 1 year ago • 1 comments

What is the purpose of the change

As described in FLINK-35427, this pull request introduces a config option in the JSON format so that the Flink job can be made to fail when the input schema evolves.

Copied from FLINK-35427: In many cases, the consumer and producer of a message queue come from different teams, or even different companies. As a message consumer, it is sometimes difficult to keep track of updates to the message format, which may result in data loss. We want to ensure that the message format strictly matches the schema: if a field is missing or a new field is added, it is better to fail the application so that we can notice and fix the problem quickly. In this case, a fail-on-unknown-field config for the JSON format could be very helpful, especially when combined with fail-on-missing-field.

Brief change log

  • Add JsonFormatOptions.FAIL_ON_UNKNOWN_FIELD
  • Add unknown field check code in converter classes
  • Add JsonSchemaException, which extends JsonParseException
  • Fix some test cases that were not working properly
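
To illustrate the kind of check described above, here is a minimal, self-contained sketch in plain Java. It is not the code from this pull request: the class `UnknownFieldCheckSketch`, the exception `SchemaMismatchException`, and the methods `unknownFields` and `validate` are hypothetical names chosen for illustration, and a record is modeled as a simple `Map` rather than a parsed JSON node.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class UnknownFieldCheckSketch {

    /** Hypothetical stand-in for a schema-mismatch exception; not the class added by the PR. */
    static class SchemaMismatchException extends RuntimeException {
        SchemaMismatchException(String message) {
            super(message);
        }
    }

    /** Returns the field names present in the record but not declared in the schema, sorted. */
    static Set<String> unknownFields(Set<String> schemaFields, Map<String, ?> record) {
        Set<String> unknown = new TreeSet<>(record.keySet());
        unknown.removeAll(schemaFields);
        return unknown;
    }

    /** Throws if the flag is enabled and the record carries fields the schema does not declare. */
    static void validate(Set<String> schemaFields, Map<String, ?> record, boolean failOnUnknownField) {
        if (!failOnUnknownField) {
            return; // lenient mode: unknown fields are silently ignored
        }
        Set<String> unknown = unknownFields(schemaFields, record);
        if (!unknown.isEmpty()) {
            throw new SchemaMismatchException("Unknown field(s) in JSON record: " + unknown);
        }
    }

    public static void main(String[] args) {
        Set<String> schema = Set.of("id", "name");
        Map<String, Object> evolved = Map.of("id", 1, "name", "a", "email", "a@example.com");

        validate(schema, evolved, false); // passes: the extra field is ignored
        try {
            validate(schema, evolved, true); // fails: "email" is not in the schema
        } catch (SchemaMismatchException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With the flag disabled the extra field is dropped without notice, which is the silent data loss the issue describes; with it enabled the mismatch surfaces as an exception that can fail the job.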

Verifying this change

This change added tests and can be verified as follows:

  • JsonFormatFactoryTest#testFailOnUnknownField
  • JsonParserRowDataDeSerSchemaTest#testParsePartialJson
  • JsonRowDataSerDeSchemaTest#testDeserializationUnknownField
  • JsonRowDeserializationSchemaTest#testUnknownNode

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (yes)
  • The serializers: (don't know)
  • The runtime per-record code paths (performance sensitive): (no)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
  • The S3 file system connector: (no)

Documentation

  • Does this pull request introduce a new feature? (yes)
  • If yes, how is the feature documented? (docs)

huhuan1898 avatar May 27 '24 10:05 huhuan1898

CI report:

  • 5e295d0d0d2c361e9786973ae13abef49577eee5 Azure: SUCCESS

Bot commands

The @flinkbot bot supports the following commands:

  • @flinkbot run azure re-run the last Azure build

flinkbot avatar May 27 '24 10:05 flinkbot

This PR is being marked as stale since it has not had any activity in the last 90 days. If you would like to keep this PR alive, please leave a comment asking for a review. If the PR has merge conflicts, update it with the latest from the base branch.

If you are having difficulty finding a reviewer, please reach out to the community. Contact details can be found here: https://flink.apache.org/what-is-flink/community/

If this PR is no longer valid or desired, please feel free to close it. If no activity occurs in the next 30 days, it will be automatically closed.

github-actions[bot] avatar Apr 05 '25 06:04 github-actions[bot]

This PR has been closed since it has not had any activity in 120 days. If you feel like this was a mistake, or you would like to continue working on it, please feel free to re-open the PR and ask for a review.

github-actions[bot] avatar May 06 '25 06:05 github-actions[bot]