matrixone
[Bug]: CDC keeps sending all old data as updates without newly inserted data.
Is there an existing issue for the same bug?
- [x] I have checked the existing issues.
Branch Name
main
Commit ID
27590cf
Other Environment Information
- Hardware parameters:
- OS type:
- Others:
Actual Behavior
Sometimes CDC only gives me UPDATEs for all the old data, without the new INSERT data, and keeps looping.
No errors are found in the log, except "wait too long" and "unexpect watermark".
Expected Behavior
No response
Steps to Reproduce
Use the repo
https://github.com/cpegeric/matrixone/tree/cdc_sqlexecutor_cleanup
with branch cdc_sqlexecutor_cleanup
Download the tool with
git clone git@github.com:cpegeric/wiki-benchmark.git
In MO,
> create database eric;
From command line,
% cd wiki-benchmark/python
% python indextest.py buildcdc 127.0.0.1 eric src hnswidx vector_l2_ops 128 1000000 hnsw
In MO,
> select count(*) from src;
Log,
check logs/stderr-xxx for the logs.
Additional information
No response
A clean start always works. However, the issue mostly happens after running drop cdc and drop pitr, followed by create cdc and create pitr.
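The watermark is persisted every 1s. On restart, reading resumes from the last persisted watermark, so some updates are sent again. While the first batch of data (i.e. the data that existed before the CDC task was created) is being sent, the watermark stays at 0, so after a restart all of the data is read again. Sending duplicate data without a restart has not been reproduced yet.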
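To illustrate the replay behavior described above, here is a toy model in Python. It is only a sketch, not MatrixOne's CDC code: the function name `replay_after_restart` and the integer timestamps are made up for the example, and it assumes the reader simply re-sends every change newer than the last persisted watermark.

```python
def replay_after_restart(commit_ts, persisted_watermark):
    """Return the change timestamps a reader would send again after a restart.

    Hypothetical model: the reader resumes from the last persisted
    watermark and re-emits every change with a larger timestamp.
    """
    return [ts for ts in commit_ts if ts > persisted_watermark]


# Changes with timestamps 1..10 have already been sent downstream.
already_sent = list(range(1, 11))

# Incremental phase: the watermark was last persisted at 8 before the restart,
# so only the small tail after it is sent twice.
incremental_replay = replay_after_restart(already_sent, persisted_watermark=8)

# First batch (data that existed before the CDC task): the watermark is
# still 0 at restart, so everything is read and sent again.
first_batch_replay = replay_after_restart(already_sent, persisted_watermark=0)

print(incremental_replay)  # [9, 10]
print(first_batch_replay)  # all ten timestamps
```

Under this model, a restart during the first batch replays every row, which would match the "all old data as UPDATE" symptom if the downstream treats a re-sent row as an update.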