tiflow icon indicating copy to clipboard operation
tiflow copied to clipboard

codec(ticdc): fix incorrect encoding default "null" in Avro protocol

Open wk989898 opened this issue 6 months ago • 9 comments

What problem does this PR solve?

Issue Number: close #11994

What is changed and how it works?

The goavro treats string literal "null" as null in the past. This is a bug of goavro and is fixed in v2.14.0. https://github.com/linkedin/goavro/issues/280

This PR upgrades the goavro dependency and outputs the correct type.

Before

{
		"type": "record",
		"name": "test",
		"fields":
		  [
			{
			 "type": [
				 "null",
				 "string"
			 ],
			 "default": "null",
			 "name": "field"
			}
		   ]
}

After

{
		"type": "record",
		"name": "test",
		"fields":
		  [
			{
			 "type": [
				 "string",
				 "null"
			 ],
			 "default": "null",
			 "name": "field"
			}
		   ]
}

Check List

Tests

  • Unit test
  • Integration test

Questions

Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?

Release note

According to the Avro documentation (1.11), when a default value is specified for a record field whose type is a union, the type of the default value must match the first element of the union. 

wk989898 avatar Jun 09 '25 06:06 wk989898

Skipping CI for Draft Pull Request. If you want CI signal for your change, please convert it to an actual PR. You can still manually trigger a test run with /test all

ti-chi-bot[bot] avatar Jun 09 '25 06:06 ti-chi-bot[bot]

@gemini-code-assist[bot]: adding LGTM is restricted to approvers and reviewers in OWNERS files.

In response to this:

Code Review

This pull request effectively addresses the issue of incorrect Avro encoding for fields with a default value of string "null". The core of the fix, updating the goavro library and consistently using goavro.NewCodecWithOptions with EnableStringNull: false, has been applied correctly across all relevant parts of the codebase.

Particularly commendable aspects:

  • Dependency Update: Upgrading github.com/linkedin/goavro/v2 to v2.14.0 correctly leverages the upstream fix.
  • Consistent Application: The EnableStringNull: false option is consistently applied wherever Avro codecs are instantiated, including for Confluent Schema Registry, AWS Glue Schema Registry, and the simple marshaller.
  • Thorough Testing: The new test TestStringNull in pkg/sink/codec/avro/confluent_schema_registry_test.go is excellent, as it directly verifies the behavior of the goavro library with and without the new option for the problematic schema. Additionally, modifying existing test schemas to include fields with "default": "null" further strengthens the validation of this fix.

The code changes are clear, well-targeted, and significantly improve the correctness of Avro encoding in this specific scenario. No critical, high, or medium severity issues were found in the code changes themselves.

Summary of Findings

  • Release Note: The release note section in the PR description is currently a placeholder. Since this PR fixes a bug (issue #11994), a release note is likely required. Please fill this in, or if a release note is not needed, replace the placeholder with None. This was not added as a direct review comment due to severity settings.
  • Pull Request Description: The PR description is quite minimal. While the title and issue link are helpful, adding a brief summary of what was changed (e.g., mentioning the goavro library upgrade and the use of NewCodecWithOptions) in the PR description itself would improve clarity for future readers and reviewers. This was not added as a direct review comment due to severity settings.

Merge Readiness

The code changes in this pull request are of high quality and effectively address the reported bug. The tests are comprehensive and verify the fix. Before merging, please ensure the release note section in the pull request description is appropriately filled out or marked as None. Once that's addressed, the PR should be ready for merge. As an AI, I am not authorized to approve pull requests; please ensure other reviewers approve this change.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

ti-chi-bot[bot] avatar Jun 09 '25 06:06 ti-chi-bot[bot]

Codecov Report

:white_check_mark: All modified and coverable lines are covered by tests. :white_check_mark: Project coverage is 55.4981%. Comparing base (6171842) to head (0f26038). :warning: Report is 19 commits behind head on master. :white_check_mark: All tests successful. No failed tests found.

Additional details and impacted files
Components Coverage Δ
cdc 60.0203% <100.0000%> (+0.7446%) :arrow_up:
dm 50.2954% <ø> (+0.1927%) :arrow_up:
engine 53.2505% <ø> (+0.0451%) :arrow_up:
Flag Coverage Δ
unit 55.4981% <100.0000%> (+0.4344%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

@@               Coverage Diff                @@
##             master     #12191        +/-   ##
================================================
+ Coverage   55.0637%   55.4981%   +0.4344%     
================================================
  Files          1030       1030                
  Lines        143225     145084      +1859     
================================================
+ Hits          78865      80519      +1654     
- Misses        58563      58706       +143     
- Partials       5797       5859        +62     
:rocket: New features to boost your workflow:
  • :package: JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

codecov[bot] avatar Jun 09 '25 06:06 codecov[bot]

/retest

wk989898 avatar Jun 11 '25 02:06 wk989898

/retest

purelind avatar Jun 11 '25 09:06 purelind

/test pull-cdc-integration-mysql-test

wk989898 avatar Jun 11 '25 10:06 wk989898

/test pull-cdc-integration-mysql-test

wk989898 avatar Jun 11 '25 14:06 wk989898

/test pull-cdc-integration-mysql-test

wk989898 avatar Jun 13 '25 10:06 wk989898

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: 3AceShowHand, asddongmen

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • ~~OWNERS~~ [3AceShowHand,asddongmen]

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

ti-chi-bot[bot] avatar Aug 25 '25 08:08 ti-chi-bot[bot]

[LGTM Timeline notifier]

Timeline:

  • 2025-08-25 08:45:06.107498947 +0000 UTC m=+170597.985329902: :ballot_box_with_check: agreed by asddongmen.
  • 2025-08-25 08:45:16.49050054 +0000 UTC m=+170608.368331484: :ballot_box_with_check: agreed by 3AceShowHand.

ti-chi-bot[bot] avatar Aug 25 '25 08:08 ti-chi-bot[bot]