flink
flink copied to clipboard
[FLINK-30613] Improve resolving schema compatibility -- Milestone one
What is the purpose of the change
This pr reverses the direction of resolving schema compatibility could improve the usability of schema evolution. See FLIP-263 for more details.
Brief change log
In the milestone one, we should:
- Add an extra method (TypeserializeSnapshotr#resolveSchemaCompatibility(TypeSerializerSnapshot<T> oldSerializerSnapshot)) in TypeSerializerSnapshot.java as above.
- Mark the original method as deprecated and it will use new method to resolve as default.
- Implement the new method for all built-in TypeserializerSnapshots.
Verifying this change
This change is already covered by existing tests, such as TypeSerializerUpgradeTestBase, PojoSerializerSnapshotTest.
Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., is any changed class annotated with
@Public(Evolving)
: yes - The serializers: yes
- The runtime per-record code paths (performance sensitive): no
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
- The S3 file system connector: no
Documentation
- Does this pull request introduce a new feature? no
- If yes, how is the feature documented? docs
CI report:
- 5d774fdf50451923c5474c6ca1116c5f62a87c81 Azure: FAILURE
Bot commands
The @flinkbot bot supports the following commands:-
@flinkbot run azure
re-run the last Azure build
@flinkbot run azure
@gaoyunhaii @dawidwys Could you help to take a review ? I'd like to make it usable in 1.17. Thanks a lot!
I started the review, but since I started finding issues or potential issues in the migrated methods, would you be so kind and split migrating serializers/snapshots to separate commits? That would help me tremendously to do a proper review.
Thanks a lot for the review. I have splitted them as clear as possible.
@dawidwys Could you help to take a review again?
Sure, I will during my working days. Also to be clear it's definitely too late now for it to make it to 1.17. It's in a too important area to rush it in the last day of the development cycle, especially with so many open questions still.
Hey @masteryhx, do you plan to work on that for 1.18?
Hey @masteryhx, do you plan to work on that for 1.18?
Sure, I'd like to. Thanks a lot for the reminder. Big sorry for the delayed update due to my personal business last month. I will rework on the pr this month.
@flinkbot run azure
@dawidwys Hi, Sorry for the late update due to personal business. I have updated some previous commits so just also splitted them into different commits for the convinence of review. I will continue to update after your every review in the upcoming time.
BTW, The caller could call the new method instead of the old one currently, I will prepare a new pr for it after this pr.
@Zakelly @reswqa Could you also help to take a look ? Thanks a lot!
@flinkbot run azure
rebased. @Zakelly Could you also help to take a look ? Thanks a lot!
Thanks @masteryhx for the contribution!
I didn't finish the review, and left some minor comments. Please take a look in your free time, thanks~
Thanks a lot for your time! I have updated the pr by adding two commits. I will rebase some commits after the pr is approved. Please take a review again in your free time. Thanks a lot again!
Hi @masteryhx , thanks for the contribution and update!
I have reviewed this PR and left some comments. Please take a look in your free time, thanks~
Also,
The Schema Compatibility
related code is Flink’s infrastructure and is used by many components. I didn't change and read related code before this PR, so I'm not professional in this area. I have reviewed, but it would be great to have more committers who are familiar with this area to help review. It will be helpful for the bug-free.
Thanks a lot for the detailed review. We may have some key problems remaing in the comment to discuss. I'll update the pr after we are on the same way about these problems. Also ping @Zakelly @fredia here to help to review and discuss together, Thanks a lot for your time!
Hi @masteryhx , sorry for the late reply!
After reading your PR, I finally figured out why the default implementation of old method and new method should call each other. I think we could do something tricky to check whether one of the old and new methods has been implemented in specific TypeSerializerSnapshot
before it is called. https://stackoverflow.com/a/2315467 this may provide some ideas. WDYT?
Hi @masteryhx , sorry for the late reply!
After reading your PR, I finally figured out why the default implementation of old method and new method should call each other. I think we could do something tricky to check whether one of the old and new methods has been implemented in specific
TypeSerializerSnapshot
before it is called. https://stackoverflow.com/a/2315467 this may provide some ideas. WDYT?
Thanks for the advice. It's a good idea to avoid infinite loop when users don't implement both methods. I will update the pr to support this.
@flinkbot run azure
@1996fanrui @Zakelly Thanks for the review. I have updated remaining minor comments. The failed CI about python occurs from yesterday (see other failed cases) which is not related to this pr. @curcur Could you also help to take a review ? Thanks a lot!
Thanks all for the detailed review. Please let me know if any other comments. I will merge it if no other comments beyond two days from now on.