[FLINK-28185][Connector/Kafka] Make TimestampOffsetsInitializer apply offset reset str…
…ategy and handle timestamps that do not map to an offset
What is the purpose of the change
This change improves the TimestampOffsetsInitializer to be initialized with a configured offset reset strategy. The default behavior (LATEST) is preserved. This also fixes a bug for when the timestamp does not correspond to an offset in Kafka and clarifies the exception message that is thrown.
Brief change log
- Handles EARLIEST/LATEST/NONE
- For timestamps that do not correspond to an offset in Kafka and if configured with NONE, the initializer will throw an explicit exception.
Verifying this change
This change added tests and can be verified as follows:
- Added unit test to test the various offset reset strategies and the edge case where timestamp does not correspond to an offset in Kafka
Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., is any changed class annotated with
@Public(Evolving): yes - The serializers: no
- The runtime per-record code paths (performance sensitive): no
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
- The S3 file system connector: no
Documentation
- Does this pull request introduce a new feature? yes
- If yes, how is the feature documented? JavaDocs
CI report:
- 9ae1f5be5a48832c9703f087591602edd5c1b64b Azure: SUCCESS
Bot commands
The @flinkbot bot supports the following commands:@flinkbot run azurere-run the last Azure build
@PatrickRen can you help review? I can't request reviewers unfortunately :)
I also encountered this exception. I read Kafka data in batches.Therefore, we want to use latest when the timestamp is exceeded.
@mas-chen There are a couple of holidays coming up, but I've asked @PatrickRen for a review
What's the status of this PR?
@MartijnVisser @PatrickRen sorry for the delay, feedback should be addressed. Will followup with the public API change separately.
@PatrickRen have a chance to take another look?
@PatrickRen Thanks, the comments should be addressed! Regarding your earlier comment about the Public API change originally in this PR, I have filed https://issues.apache.org/jira/browse/FLINK-30200 and will followup later