tiflow
tiflow copied to clipboard
codec(ticdc): fix incorrect encoding default "null" in Avro protocol
What problem does this PR solve?
Issue Number: close #11994
What is changed and how it works?
The goavro treats string literal "null" as null in the past. This is a bug of goavro and is fixed in v2.14.0. https://github.com/linkedin/goavro/issues/280
This PR upgrades the goavro dependency and outputs the correct type.
Before
{
"type": "record",
"name": "test",
"fields":
[
{
"type": [
"null",
"string"
],
"default": "null",
"name": "field"
}
]
}
After
{
"type": "record",
"name": "test",
"fields":
[
{
"type": [
"string",
"null"
],
"default": "null",
"name": "field"
}
]
}
Check List
Tests
- Unit test
- Integration test
Questions
Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?
Release note
According to the Avro documentation (1.11), when a default value is specified for a record field whose type is a union, the type of the default value must match the first element of the union.
Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all
@gemini-code-assist[bot]: adding LGTM is restricted to approvers and reviewers in OWNERS files.
In response to this:
Code Review
This pull request effectively addresses the issue of incorrect Avro encoding for fields with a default value of string "null". The core of the fix, updating the
goavrolibrary and consistently usinggoavro.NewCodecWithOptionswithEnableStringNull: false, has been applied correctly across all relevant parts of the codebase.Particularly commendable aspects:
- Dependency Update: Upgrading
github.com/linkedin/goavro/v2tov2.14.0correctly leverages the upstream fix.- Consistent Application: The
EnableStringNull: falseoption is consistently applied wherever Avro codecs are instantiated, including for Confluent Schema Registry, AWS Glue Schema Registry, and the simple marshaller.- Thorough Testing: The new test
TestStringNullinpkg/sink/codec/avro/confluent_schema_registry_test.gois excellent, as it directly verifies the behavior of thegoavrolibrary with and without the new option for the problematic schema. Additionally, modifying existing test schemas to include fields with"default": "null"further strengthens the validation of this fix.The code changes are clear, well-targeted, and significantly improve the correctness of Avro encoding in this specific scenario. No critical, high, or medium severity issues were found in the code changes themselves.
Summary of Findings
- Release Note: The release note section in the PR description is currently a placeholder. Since this PR fixes a bug (issue #11994), a release note is likely required. Please fill this in, or if a release note is not needed, replace the placeholder with
None. This was not added as a direct review comment due to severity settings.- Pull Request Description: The PR description is quite minimal. While the title and issue link are helpful, adding a brief summary of what was changed (e.g., mentioning the
goavrolibrary upgrade and the use ofNewCodecWithOptions) in the PR description itself would improve clarity for future readers and reviewers. This was not added as a direct review comment due to severity settings.Merge Readiness
The code changes in this pull request are of high quality and effectively address the reported bug. The tests are comprehensive and verify the fix. Before merging, please ensure the release note section in the pull request description is appropriately filled out or marked as
None. Once that's addressed, the PR should be ready for merge. As an AI, I am not authorized to approve pull requests; please ensure other reviewers approve this change.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Codecov Report
:white_check_mark: All modified and coverable lines are covered by tests.
:white_check_mark: Project coverage is 55.4981%. Comparing base (6171842) to head (0f26038).
:warning: Report is 19 commits behind head on master.
:white_check_mark: All tests successful. No failed tests found.
Additional details and impacted files
| Components | Coverage Δ | |
|---|---|---|
| cdc | 60.0203% <100.0000%> (+0.7446%) |
:arrow_up: |
| dm | 50.2954% <ø> (+0.1927%) |
:arrow_up: |
| engine | 53.2505% <ø> (+0.0451%) |
:arrow_up: |
| Flag | Coverage Δ | |
|---|---|---|
| unit | 55.4981% <100.0000%> (+0.4344%) |
:arrow_up: |
Flags with carried forward coverage won't be shown. Click here to find out more.
@@ Coverage Diff @@
## master #12191 +/- ##
================================================
+ Coverage 55.0637% 55.4981% +0.4344%
================================================
Files 1030 1030
Lines 143225 145084 +1859
================================================
+ Hits 78865 80519 +1654
- Misses 58563 58706 +143
- Partials 5797 5859 +62
:rocket: New features to boost your workflow:
- :package: JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.
/retest
/retest
/test pull-cdc-integration-mysql-test
/test pull-cdc-integration-mysql-test
/test pull-cdc-integration-mysql-test
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: 3AceShowHand, asddongmen
The full list of commands accepted by this bot can be found here.
The pull request process is described here
- ~~OWNERS~~ [3AceShowHand,asddongmen]
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
[LGTM Timeline notifier]
Timeline:
2025-08-25 08:45:06.107498947 +0000 UTC m=+170597.985329902: :ballot_box_with_check: agreed by asddongmen.2025-08-25 08:45:16.49050054 +0000 UTC m=+170608.368331484: :ballot_box_with_check: agreed by 3AceShowHand.