
[common] [flink] Add support for complex types in kafka debezium avro cdc action.

Open umeshdangat opened this issue 1 year ago • 1 comments

Purpose

@zhuangchong has an outstanding PR, #3323, that adds support for the Debezium Avro format in the CDC action. That patch allows consuming Avro data from Kafka into Paimon, but it doesn't support complex Avro types. This PR adds that support. The original PR here pointed to @zhuangchong's branch to clearly show only the changes relevant to supporting complex Avro types.

More details about some of the changes are in the PR. @JingsongLi @zhuangchong could you please let me know what you think?

Note:

One issue is that CdcSourceRecord contains a Map<String, String>. The current, somewhat tedious approach is therefore to deserialize Avro complex types into JSON strings and then read them back from those JSON strings, rather than changing the map in CdcSourceRecord to Map<String, Object> so that values can be arbitrary objects. Judging by the code changes needed, that would be a much larger change.
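To illustrate the first half of that round trip, here is a minimal sketch of flattening nested values into JSON strings so that they fit into a Map<String, String>. It is not the code from this PR: the class ComplexValueAsJsonSketch and its methods are hypothetical, plain Java maps and lists stand in for Avro records and arrays, and the hand-written encoder only escapes backslashes and double quotes. Reading the strings back would go through a JSON parser and is not shown.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

public class ComplexValueAsJsonSketch {

    /** Encodes nested maps, lists, strings, numbers, booleans and null as JSON text. */
    static String toJson(Object value) {
        if (value == null) {
            return "null";
        }
        if (value instanceof Map) {
            StringJoiner joiner = new StringJoiner(",", "{", "}");
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                joiner.add(quote(String.valueOf(e.getKey())) + ":" + toJson(e.getValue()));
            }
            return joiner.toString();
        }
        if (value instanceof List) {
            StringJoiner joiner = new StringJoiner(",", "[", "]");
            for (Object item : (List<?>) value) {
                joiner.add(toJson(item));
            }
            return joiner.toString();
        }
        if (value instanceof Number || value instanceof Boolean) {
            return value.toString();
        }
        return quote(value.toString());
    }

    /** Minimal escaping (assumption): backslash and double quote only. */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Turns every value into a String; complex values become JSON strings. */
    static Map<String, String> flatten(Map<String, Object> record) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : record.entrySet()) {
            Object v = e.getValue();
            boolean complex = v instanceof Map || v instanceof List;
            out.put(e.getKey(), v == null ? null : complex ? toJson(v) : v.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        // Stand-in for a nested Avro record with an array field.
        Map<String, Object> address = new LinkedHashMap<>();
        address.put("city", "Berlin");
        address.put("zip", 10115);
        Map<String, Object> record = new LinkedHashMap<>();
        record.put("id", 1);
        record.put("tags", Arrays.asList("a", "b"));
        record.put("address", address);
        System.out.println(flatten(record));
    }
}
```

The extra encode and decode step is the cost of keeping the value type as String; switching the map to Map<String, Object> would avoid it but touches every consumer of the record.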

I had to update DataField to add a method, dataFieldEqualsIgnoreId, which already exists in RichEventParser. This becomes necessary for nested RowType fields (coming from nested Avro records): when DataField.type is a RowType, we cannot simply call equals on the data fields, because they also contain an id, so the equality check fails even though there is no schema change.
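The idea behind comparing fields while ignoring ids can be sketched as follows. This is a simplified illustration, not Paimon's implementation: Type and Field are hypothetical stand-ins for DataType and DataField, and the comparison only looks at names and nested structure.

```java
import java.util.Arrays;
import java.util.List;

public class EqualsIgnoreIdSketch {

    /** Hypothetical stand-in for a data type; row types carry nested fields. */
    static class Type {
        final String name;        // e.g. "STRING", "INT", "ROW"
        final List<Field> fields; // empty unless this is a row type

        Type(String name, Field... fields) {
            this.name = name;
            this.fields = Arrays.asList(fields);
        }
    }

    /** Hypothetical stand-in for a data field: id, name and type. */
    static class Field {
        final int id;
        final String name;
        final Type type;

        Field(int id, String name, Type type) {
            this.id = id;
            this.name = name;
            this.type = type;
        }
    }

    /** Compares two types, recursing into nested fields without looking at ids. */
    static boolean typeEqualsIgnoreId(Type a, Type b) {
        if (!a.name.equals(b.name) || a.fields.size() != b.fields.size()) {
            return false;
        }
        for (int i = 0; i < a.fields.size(); i++) {
            if (!fieldEqualsIgnoreId(a.fields.get(i), b.fields.get(i))) {
                return false;
            }
        }
        return true;
    }

    /** Two fields match when name and type match; the id is deliberately skipped. */
    static boolean fieldEqualsIgnoreId(Field a, Field b) {
        return a.name.equals(b.name) && typeEqualsIgnoreId(a.type, b.type);
    }

    public static void main(String[] args) {
        // Same nested structure, different ids at both levels.
        Field existing =
                new Field(0, "address", new Type("ROW", new Field(1, "city", new Type("STRING"))));
        Field incoming =
                new Field(5, "address", new Type("ROW", new Field(6, "city", new Type("STRING"))));
        System.out.println(fieldEqualsIgnoreId(existing, incoming));
    }
}
```

The recursion is what matters for nested records: an id-aware equals on the outer field would compare the inner fields' ids too, which is why a top-level check that skips only the outer id is not enough.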

Tests

API and Format

Documentation

umeshdangat avatar Aug 09 '24 22:08 umeshdangat

Thanks @umeshdangat for the contribution!

Can you rebase master? https://github.com/apache/paimon/pull/3323 has been merged.

JingsongLi avatar Aug 11 '24 13:08 JingsongLi

Closing this PR as the changes here are merged via #4246

umeshdangat avatar Oct 09 '24 15:10 umeshdangat