
TiCDC cannot use azblob:// and gcs:// cloud storage sink since v6.5.3 due to use of ObjPrefix

Open kennytm opened this issue 1 year ago • 2 comments

What did you do?

cdc cli changefeed create --sink-uri="azure://bucket/prefix?protocol=canal-json&enable-tidb-extension=true" -c cf

What did you expect to see?

The changefeed is created and runs normally.

What did you see instead?

The changefeed failed with error code CDC:ErrProcessorUnknown and the message "azure storage not support ObjPrefix for now".

The log contained an error:

[2024/02/05 17:19:53.999 +00:00] [ERROR] [dml_worker.go:164] ["failed to write schema file to external storage"] [workerID=13] [namespace=default] [changefeed=cf] [error="azure storage not support ObjPrefix for now"] [errorVerbose="azure storage not support ObjPrefix for now\n
github.com/pingcap/tidb/br/pkg/storage.(*AzureBlobStorage).WalkDir\n
\tgithub.com/pingcap/[email protected]/br/pkg/storage/azblob.go:367\n
github.com/pingcap/tiflow/pkg/sink/cloudstorage.(*FilePathGenerator).CheckOrWriteSchema\n
\tgithub.com/pingcap/tiflow/pkg/sink/cloudstorage/path.go:203\n
github.com/pingcap/tiflow/cdc/sinkv2/eventsink/cloudstorage.(*dmlWorker).flushMessages\n
\tgithub.com/pingcap/tiflow/cdc/sinkv2/eventsink/cloudstorage/dml_worker.go:162\n
github.com/pingcap/tiflow/cdc/sinkv2/eventsink/cloudstorage.(*dmlWorker).run.func1\n
\tgithub.com/pingcap/tiflow/cdc/sinkv2/eventsink/cloudstorage/dml_worker.go:135\n
golang.org/x/sync/errgroup.(*Group).Go.func1\n
\tgolang.org/x/[email protected]/errgroup/errgroup.go:75\n
runtime.goexit\n\truntime/asm_amd64.s:1594"]

Versions of the cluster

Upstream TiDB cluster version (execute SELECT tidb_version(); in a MySQL client):

(irrelevant)

Upstream TiKV version (execute tikv-server --version):

(irrelevant)

TiCDC version (execute cdc version):

v6.5.6

Feb 06 '24 17:02 kennytm

The problem was introduced by #8921 (since v6.5.3), which started to use the ObjPrefix option in WalkDir. The problem still existed as of v6.5.8.

https://github.com/pingcap/tiflow/blob/c86e7013411bcfd56f275ee671a79aabe3cdaac1/pkg/sink/cloudstorage/path.go#L203-L206

The ObjPrefix option was first introduced by pingcap/tidb#33409 in TiDB v6.1.0. It was supported only on s3:// and local:// storage; on azblob:// and gcs://, using ObjPrefix caused an "unsupported" error. Since TiDB v7.0.0 (pingcap/tidb#42050), azblob:// and gcs:// also recognize ObjPrefix.

This means TiCDC versions in the range [v6.5.3, v6.5.8] are unable to write to azblob:// and gcs://.
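
For anyone who wants to check a deployment quickly, the range above can be expressed as a small helper. This is only a sketch: the function name `affected` is made up for this comment, and the upper bound v6.5.8 is simply the latest release checked at the time of writing, not a confirmed fix version.

```go
package main

import "fmt"

// affected reports whether a TiCDC version string such as "v6.5.6"
// falls inside the range [v6.5.3, v6.5.8] discussed in this issue.
// Hypothetical helper; the upper bound is an assumption taken from
// the latest release checked when this comment was written.
func affected(version string) (bool, error) {
	var major, minor, patch int
	if _, err := fmt.Sscanf(version, "v%d.%d.%d", &major, &minor, &patch); err != nil {
		return false, fmt.Errorf("parse %q: %w", version, err)
	}
	return major == 6 && minor == 5 && patch >= 3 && patch <= 8, nil
}

func main() {
	for _, v := range []string{"v6.5.2", "v6.5.6", "v6.6.0", "v7.1.0"} {
		ok, err := affected(v)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s affected: %v\n", v, ok)
	}
}
```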

v6.6.0 does not seem to be affected because #8881 was not cherry-picked to release-6.6. v7.x is not affected because TiDB "fixed" the error.

I think the ObjPrefix fix introduced by pingcap/tidb#42050 should be cherry-picked to TiDB's release-6.5 and release-6.6. Alternatively, TiCDC could avoid using the ObjPrefix option, but that would degrade the performance of the S3 sink.

Feb 06 '24 17:02 kennytm

/severity major

Feb 19 '24 02:02 fubinzh

closed by #10732

Apr 24 '24 03:04 CharlesCheung96

This issue only affects v6.5.

Apr 26 '24 09:04 flowbehappy