[Bug]: The result of filter 'where column in(a,b,c) is not correct'.
Is there an existing issue for the same bug?
- [X] I have checked the existing issues.
Branch Name
main
Commit ID
3e36439cf387571c341d8dc971f47acb9cc12dde
Other Environment Information
- Hardware parameters:
- OS type:
- Others:
Actual Behavior
Actual result: actual_result.csv Expect result: expect_result.csv
log:test_bykey.log
Expected Behavior
No response
Steps to Reproduce
CREATE TABLE `table_with_pk_index_for_load_100m` (
`id` BIGINT NOT NULL,
`col1` TINYINT DEFAULT NULL,
`col2` SMALLINT DEFAULT NULL,
`col3` INT DEFAULT NULL,
`col4` BIGINT DEFAULT NULL,
`col5` TINYINT UNSIGNED DEFAULT NULL,
`col6` SMALLINT UNSIGNED DEFAULT NULL,
`col7` INT UNSIGNED DEFAULT NULL,
`col8` BIGINT UNSIGNED DEFAULT NULL,
`col9` FLOAT DEFAULT NULL,
`col10` DOUBLE DEFAULT NULL,
`col11` VARCHAR(255) DEFAULT NULL,
`col12` DATE DEFAULT NULL,
`col13` DATETIME DEFAULT NULL,
`col14` TIMESTAMP NULL DEFAULT NULL,
`col15` BOOL DEFAULT NULL,
`col16` DECIMAL(16,6) DEFAULT NULL,
`col17` TEXT DEFAULT NULL,
`col18` JSON DEFAULT NULL,
`col19` BLOB DEFAULT NULL,
`col20` BINARY(255) DEFAULT NULL,
`col21` VARBINARY(255) DEFAULT NULL,
`col22` VECF32(3) DEFAULT NULL,
`col23` VECF32(3) DEFAULT NULL,
`col24` VECF64(3) DEFAULT NULL,
`col25` VECF64(3) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `col4` (`col4`),
KEY `col3` (`col3`)
);
load data url s3option {'endpoint'='http://cos.ap-guangzhou.myqcloud.com','access_key_id'='','secret_access_key'='','bucket'='mo-load-guangzhou-1308875761','filepath'='mo-big-data/1000000000_20_columns_load_data_pk.csv'} into table big_data_test.table_with_pk_index_for_load_100M fields terminated by '|' lines terminated by '\n' ignore 1 lines parallel 'true';
(have already sent load sql to @aunjgr)
select id, col1, col2, col3, col4, col10, col11, col14, col16, col17 from big_data_test.table_with_pk_index_for_load_100M where id in (1,10000,100000,1000000,5000000,11000000,21000000,41000000,61000000,81000000) or
id in (2,20000,200000,2000000,5100000,12000000,22000000,42000000,62000000,82000000) or
id in (3,30000,300000,3000000,5200000,13000000,23000000,43000000,63000000,83000000) or
id in (4,40000,400000,4000000,5300000,14000000,24000000,44000000,64000000,84000000) or
id in (5,50000,500000,5000000,5400000,15000000,25000000,45000000,65000000,85000000) or
id in (6,60000,600000,6000000,5500000,16000000,26000000,46000000,66000000,86000000) or
id in (7,70000,700000,7000000,5600000,17000000,27000000,47000000,67000000,87000000) or
id in (8,80000,800000,8000000,5700000,18000000,28000000,48000000,68000000,88000000) or
id in (9,90000,900000,9000000,5800000,19000000,29000000,49000000,69000000,89000000) or
id in (10,100001,1000001,10000001,5900000,20000001,30000000,50000000,70000000,90000000)
order by id;
Additional information
No response
1.2-dev也有这个问题 commit-id:9a2c756c899b6755fc1997c05e3cfec61a89f36a
log: test_bykey_1.log
测试步骤与issue里面的描述相同。
二分结果:
出问题的commit:82a61fea6807e7f821157c26d9dd5a73a0a33dc9
@gouhongshen 用这个简单的case就可以复现:
create table t1(a int primary key);
insert into t1 (select * from generate_series(1,1000000,1)g);
select * from t1 where a in (1,2) or a in (10001,10002) or a in (20001,20002) or a in (40001,40002);
结果里只有前两个in起了作用。
测试步骤:
CREATE TABLE `table_with_pk_index_for_load_100m` (
`id` BIGINT NOT NULL,
`col1` TINYINT DEFAULT NULL,
`col2` SMALLINT DEFAULT NULL,
`col3` INT DEFAULT NULL,
`col4` BIGINT DEFAULT NULL,
`col5` TINYINT UNSIGNED DEFAULT NULL,
`col6` SMALLINT UNSIGNED DEFAULT NULL,
`col7` INT UNSIGNED DEFAULT NULL,
`col8` BIGINT UNSIGNED DEFAULT NULL,
`col9` FLOAT DEFAULT NULL,
`col10` DOUBLE DEFAULT NULL,
`col11` VARCHAR(255) DEFAULT NULL,
`col12` DATE DEFAULT NULL,
`col13` DATETIME DEFAULT NULL,
`col14` TIMESTAMP NULL DEFAULT NULL,
`col15` BOOL DEFAULT NULL,
`col16` DECIMAL(16,6) DEFAULT NULL,
`col17` TEXT DEFAULT NULL,
`col18` JSON DEFAULT NULL,
`col19` BLOB DEFAULT NULL,
`col20` BINARY(255) DEFAULT NULL,
`col21` VARBINARY(255) DEFAULT NULL,
`col22` VECF32(3) DEFAULT NULL,
`col23` VECF32(3) DEFAULT NULL,
`col24` VECF64(3) DEFAULT NULL,
`col25` VECF64(3) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `col4` (`col4`),
KEY `col3` (`col3`)
);
load data url s3option {'endpoint'='http://cos.ap-guangzhou.myqcloud.com','access_key_id'='','secret_access_key'='','bucket'='mo-load-guangzhou-1308875761','filepath'='mo-big-data/1000000000_20_columns_load_data_pk.csv'} into table big_data_test.table_with_pk_index_for_load_100M fields terminated by '|' lines terminated by '\n' ignore 1 lines parallel 'true';
select count(*) from big_data_test.table_with_pk_index_for_load_100M where id in (1,10000,100000,1000000,5000000,11000000,21000000,41000000,61000000,81000000) or
id in (2,20000,200000,2000000,5100000,12000000,22000000,42000000,62000000,82000000) or
id in (3,30000,300000,3000000,5200000,13000000,23000000,43000000,63000000,83000000) or
id in (4,40000,400000,4000000,5300000,14000000,24000000,44000000,64000000,84000000) or
id in (5,50000,500000,5000000,5400000,15000000,25000000,45000000,65000000,85000000) or
id in (6,60000,600000,6000000,5500000,16000000,26000000,46000000,66000000,86000000) or
id in (7,70000,700000,7000000,5600000,17000000,27000000,47000000,67000000,87000000) or
id in (8,80000,800000,8000000,5700000,18000000,28000000,48000000,68000000,88000000) or
id in (9,90000,900000,9000000,5800000,19000000,29000000,49000000,69000000,89000000) or
id in (10,100001,1000001,10000001,5900000,20000001,30000000,50000000,70000000,90000000)
order by id;
main commit:c6fa50c4f9f83cf5a76cf09a239f1be79c0646c5
1.2 commit: 4df8e561b3a6f36ab519d19f69d55956bab90955
结论:main与1.2-dev测试均通过.
Wrong result is automatically s-1
https://github.com/matrixorigin/matrixone/issues/17200#issuecomment-2203196673
https://github.com/matrixorigin/matrixone/issues/17200#issuecomment-2206084934
https://github.com/matrixorigin/matrixone/issues/17200#issuecomment-2208500850
https://github.com/matrixorigin/matrixone/issues/17200#issuecomment-2210613559
该issue可以close