paimon
[Bug] first-row merge engine: batch incremental count does not equal changelogRecordCount
Search before asking
- [X] I searched in the issues and found nothing similar.
Paimon version
0.7
Compute Engine
flink
Minimal reproduce step
There are 408859167 changelog records between snapshot-12 and snapshot-14, but the batch incremental count is 1545129599.
SELECT count(1) FROM my_catalog_paimon.db.table /*+ OPTIONS('incremental-between' = '12,14') */ limit 10;
The result of a scan with scan.file-creation-time-millis is correct:
SELECT count(1) FROM my_catalog_paimon.db.table /*+ OPTIONS('scan.file-creation-time-millis' = '1714119900000','scan.infer-parallelism.max'='500') */
snapshot-12
{
"version" : 3,
"id" : 12,
"schemaId" : 0,
"baseManifestList" : "manifest-list-a0cd3036-c7a9-4d97-9b3b-4fe117a2806e-2",
"deltaManifestList" : "manifest-list-a0cd3036-c7a9-4d97-9b3b-4fe117a2806e-3",
"changelogManifestList" : "manifest-list-a0cd3036-c7a9-4d97-9b3b-4fe117a2806e-4",
"commitUser" : "bb69057c-5a2b-4c8d-bf88-ebea7b373764",
"commitIdentifier" : 9223372036854775807,
"commitKind" : "COMPACT",
"timeMillis" : 1714115951112,
"logOffsets" : { },
"totalRecordCount" : 949445145099,
"deltaRecordCount" : -4538585,
"changelogRecordCount" : 42198489518,
"watermark" : -9223372036854775808
}
snapshot-14
{
"version" : 3,
"id" : 14,
"schemaId" : 0,
"baseManifestList" : "manifest-list-bf44c64a-80e6-410d-b43c-2d65b8c03823-2",
"deltaManifestList" : "manifest-list-bf44c64a-80e6-410d-b43c-2d65b8c03823-3",
"changelogManifestList" : "manifest-list-bf44c64a-80e6-410d-b43c-2d65b8c03823-4",
"commitUser" : "b457cb4b-2a32-4ca7-be21-f50328894121",
"commitIdentifier" : 9223372036854775807,
"commitKind" : "COMPACT",
"timeMillis" : 1714124210161,
"logOffsets" : { },
"totalRecordCount" : 949854004266,
"deltaRecordCount" : -1136270432,
"changelogRecordCount" : 408859167,
"watermark" : -9223372036854775808
}
What doesn't meet your expectations?
The batch incremental count should equal changelogRecordCount.
Anything else?
CREATE TABLE my_catalog_paimon.db.table(
a bigint,
b bigint,
c string,
d string,
e string,
f string,
g string,
h string,
PRIMARY KEY (a,b) NOT ENFORCED ) WITH (
'merge-engine'='first-row',
'changelog-producer' = 'lookup',
'bucket' = '5000'
);
Are you willing to submit a PR?
- [ ] I'm willing to submit a PR!
You can set 'incremental-between-scan-mode' = 'changelog' to fix this.
I will change the default behavior.
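For reference, the suggested workaround might look like the query below. This is a minimal sketch that reuses the catalog, database, and table names from the report as placeholders and assumes the incremental-between-scan-mode option is available in the Paimon version in use. It has not been run against this table, so the resulting count is not shown.

```sql
-- Sketch only: the incremental query from the report, with the scan mode
-- set explicitly so that the read uses changelog files.
-- my_catalog_paimon.db.table is the placeholder name used in the report.
SELECT count(1)
FROM my_catalog_paimon.db.table
/*+ OPTIONS(
    'incremental-between' = '12,14',
    'incremental-between-scan-mode' = 'changelog'
) */;
```

Setting the mode in the query hint keeps the change local to this one read, so other jobs on the same table are not affected.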