tiflow

Kafka/Avro: truncated columns not handled correctly

Open dveeden opened this issue 2 years ago • 2 comments

What did you do?

Setup

TiUP Playground with TiCDC

tiup playground --ticdc 1 --tiflash 0 --without-monitor nightly

Kafka

export CONFLUENT_HOME=/home/dvaneeden/confluent-7.3.1
confluent local services start

Changefeed

tiup cdc:nightly cli changefeed create \
--sink-uri="kafka://127.0.0.1:9092/test?protocol=avro" \
--schema-registry="http://127.0.0.1:8081"

Consumer

cd $CONFLUENT_HOME
./bin/kafka-avro-console-consumer --topic test \
--bootstrap-server 127.0.0.1:9092 \
--property schema.registry.url=http://127.0.0.1:8081 \
--from-beginning

Table

CREATE TABLE t1 (id INT PRIMARY KEY AUTO_INCREMENT, c1 CHAR(5));
INSERT INTO t1(c1) VALUES ("foo"),("bar"),("baz");

Test

sql> SET sql_mode='';
Query OK, 0 rows affected (0.0003 sec)

sql> ALTER TABLE t1 MODIFY COLUMN c1 CHAR(2);
Query OK, 0 rows affected, 1 warning (0.7475 sec)
Warning (code 1265): 3 warnings with this error code, first warning: Data truncated for column 'c1', value is 'foo'

sql> TABLE t1;
+----+----+
| id | c1 |
+----+----+
|  1 | fo |
|  2 | ba |
|  3 | ba |
+----+----+
3 rows in set (0.0011 sec)

And from the consumer:

{"id":1,"c1":{"string":"foo"}}
{"id":2,"c1":{"string":"bar"}}
{"id":3,"c1":{"string":"baz"}}

What did you expect to see?

The consumer should show the same truncated data as the table.

What did you see instead?

The table data is truncated, but the data in the consumer is not.
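
The mismatch can be checked mechanically. The following is a minimal sketch, not part of the original report: it hard-codes the consumer output and the `TABLE t1;` result shown above, and the helper names `unwrap` and `find_mismatches` are made up for illustration. It assumes the consumer prints one JSON object per line, with nullable strings wrapped as `{"string": ...}`, as in the output above.

```python
import json

# Consumer output copied from this report (nullable string wrapped as {"string": ...}).
CONSUMER_LINES = [
    '{"id":1,"c1":{"string":"foo"}}',
    '{"id":2,"c1":{"string":"bar"}}',
    '{"id":3,"c1":{"string":"baz"}}',
]

# Rows returned by `TABLE t1;` after the ALTER TABLE in this report.
TABLE_ROWS = {1: "fo", 2: "ba", 3: "ba"}


def unwrap(value):
    """Unwrap a single-branch union such as {"string": "foo"}; pass other values through."""
    if isinstance(value, dict) and len(value) == 1:
        return next(iter(value.values()))
    return value


def find_mismatches(consumer_lines, table_rows):
    """Return (id, consumer_value, table_value) for every row whose c1 differs."""
    mismatches = []
    for line in consumer_lines:
        record = json.loads(line)
        seen = unwrap(record["c1"])
        stored = table_rows.get(record["id"])
        if seen != stored:
            mismatches.append((record["id"], seen, stored))
    return mismatches


if __name__ == "__main__":
    for row_id, seen, stored in find_mismatches(CONSUMER_LINES, TABLE_ROWS):
        print(f"id={row_id}: consumer has {seen!r}, table has {stored!r}")
```

With the data from this report, all three rows are reported as mismatched.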

Versions of the cluster

Upstream TiDB cluster version (execute SELECT tidb_version(); in a MySQL client):

Release Version: v6.6.0-alpha
Edition: Community
Git Commit Hash: dbce2cbb51a070eb2c164adc8d4f52d6a755275f
Git Branch: heads/refs/tags/v6.6.0-alpha
UTC Build Time: 2023-01-16 14:33:21
GoVersion: go1.19.3
Race Enabled: false
TiKV Min Version: 6.2.0-alpha
Check Table Before Drop: false
Store: tikv

Upstream TiKV version (execute tikv-server --version):

TiKV 
Release Version:   6.6.0-alpha
Edition:           Community
Git Commit Hash:   a3c15ce27d582dc695848bffb363631f4cae2db5
Git Commit Branch: heads/refs/tags/v6.6.0-alpha
UTC Build Time:    2023-01-16 14:34:09
Rust Version:      rustc 1.67.0-nightly (96ddd32c4 2022-11-14)
Enable Features:   pprof-fp jemalloc mem-profiling portable sse test-engine-kv-rocksdb test-engine-raft-raft-engine cloud-aws cloud-gcp cloud-azure
Profile:           dist_release

TiCDC version (execute cdc version):

Release Version: v6.6.0-alpha
Git Commit Hash: 6e1088adf38e92e2a165954636603dc1ff479ee5
Git Branch: heads/refs/tags/v6.6.0-alpha
UTC Build Time: 2023-01-16 14:25:01
Go Version: go version go1.19.3 linux/amd64
Failpoint Build: false

dveeden, Jan 17 '23 10:01

Related to #8095

dveeden, Jan 17 '23 10:01

/assign @hi-rustin

nongfushanquan, Jun 13 '23 07:06

This behavior is by design. TiCDC does not replicate data changes caused by lossy DDL operations. See https://github.com/pingcap/tiflow/issues/8686.

flowbehappy, Apr 23 '24 08:04
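
Since the truncation is not replicated, a consumer that needs to match the upstream table would have to apply the new column width itself. The following is a rough sketch of that idea under stated assumptions, not a TiCDC feature: `CHAR_WIDTHS` and `truncate_record` are hypothetical names, the width mapping would have to be maintained by the consumer, and the code only models the prefix truncation seen in this report with `sql_mode=''` (it ignores padding and collation details).

```python
# Hypothetical mapping of column name -> declared CHAR width after the DDL.
CHAR_WIDTHS = {"c1": 2}


def truncate_record(record, char_widths):
    """Return a copy of `record` with string columns cut to their declared width."""
    result = dict(record)
    for column, width in char_widths.items():
        value = result.get(column)
        # Only plain strings are truncated; NULLs and other types pass through.
        if isinstance(value, str):
            result[column] = value[:width]
    return result


if __name__ == "__main__":
    print(truncate_record({"id": 1, "c1": "foo"}, CHAR_WIDTHS))
```

Applied to the rows in this report, `foo`, `bar`, and `baz` become `fo`, `ba`, and `ba`, which matches the upstream table.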