matrixone icon indicating copy to clipboard operation
matrixone copied to clipboard

[Bug]: The result of filter 'where column in(a,b,c) is not correct'.

Open Ariznawlll opened this issue 1 year ago • 7 comments

Is there an existing issue for the same bug?

  • [X] I have checked the existing issues.

Branch Name

main

Commit ID

3e36439cf387571c341d8dc971f47acb9cc12dde

Other Environment Information

- Hardware parameters:
- OS type:
- Others:

Actual Behavior

Actual result: actual_result.csv Expect result: expect_result.csv

log:test_bykey.log

Expected Behavior

No response

Steps to Reproduce

CREATE TABLE `table_with_pk_index_for_load_100m` (
  `id` BIGINT NOT NULL,
  `col1` TINYINT DEFAULT NULL,
  `col2` SMALLINT DEFAULT NULL,
  `col3` INT DEFAULT NULL,
  `col4` BIGINT DEFAULT NULL,
  `col5` TINYINT UNSIGNED DEFAULT NULL,
  `col6` SMALLINT UNSIGNED DEFAULT NULL,
  `col7` INT UNSIGNED DEFAULT NULL,
  `col8` BIGINT UNSIGNED DEFAULT NULL,
  `col9` FLOAT DEFAULT NULL,
  `col10` DOUBLE DEFAULT NULL,
  `col11` VARCHAR(255) DEFAULT NULL,
  `col12` DATE DEFAULT NULL,
  `col13` DATETIME DEFAULT NULL,
  `col14` TIMESTAMP NULL DEFAULT NULL,
  `col15` BOOL DEFAULT NULL,
  `col16` DECIMAL(16,6) DEFAULT NULL,
  `col17` TEXT DEFAULT NULL,
  `col18` JSON DEFAULT NULL,
  `col19` BLOB DEFAULT NULL,
  `col20` BINARY(255) DEFAULT NULL,
  `col21` VARBINARY(255) DEFAULT NULL,
  `col22` VECF32(3) DEFAULT NULL,
  `col23` VECF32(3) DEFAULT NULL,
  `col24` VECF64(3) DEFAULT NULL,
  `col25` VECF64(3) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `col4` (`col4`),
  KEY `col3` (`col3`)
);

load data url s3option {'endpoint'='http://cos.ap-guangzhou.myqcloud.com','access_key_id'='','secret_access_key'='','bucket'='mo-load-guangzhou-1308875761','filepath'='mo-big-data/1000000000_20_columns_load_data_pk.csv'} into table big_data_test.table_with_pk_index_for_load_100M fields terminated by '|' lines terminated by '\n' ignore 1 lines parallel 'true';
(have already sent load sql to @aunjgr)

select id, col1, col2, col3, col4, col10, col11, col14, col16, col17 from big_data_test.table_with_pk_index_for_load_100M where id in (1,10000,100000,1000000,5000000,11000000,21000000,41000000,61000000,81000000) or
                                                                    id in (2,20000,200000,2000000,5100000,12000000,22000000,42000000,62000000,82000000) or
                                                                    id in (3,30000,300000,3000000,5200000,13000000,23000000,43000000,63000000,83000000) or
                                                                    id in (4,40000,400000,4000000,5300000,14000000,24000000,44000000,64000000,84000000) or
                                                                    id in (5,50000,500000,5000000,5400000,15000000,25000000,45000000,65000000,85000000) or
                                                                    id in (6,60000,600000,6000000,5500000,16000000,26000000,46000000,66000000,86000000) or
                                                                    id in (7,70000,700000,7000000,5600000,17000000,27000000,47000000,67000000,87000000) or
                                                                    id in (8,80000,800000,8000000,5700000,18000000,28000000,48000000,68000000,88000000) or
                                                                    id in (9,90000,900000,9000000,5800000,19000000,29000000,49000000,69000000,89000000) or
                                                                    id in (10,100001,1000001,10000001,5900000,20000001,30000000,50000000,70000000,90000000)
                                                                    order by id;

Additional information

No response

Ariznawlll avatar Jun 18 '24 11:06 Ariznawlll

1.2-dev也有这个问题 commit-id:9a2c756c899b6755fc1997c05e3cfec61a89f36a

log: test_bykey_1.log

测试步骤与issue里面的描述相同。

Ariznawlll avatar Jun 18 '24 14:06 Ariznawlll

二分结果: image 出问题的commit:82a61fea6807e7f821157c26d9dd5a73a0a33dc9

Ariznawlll avatar Jun 18 '24 15:06 Ariznawlll

@gouhongshen 用这个简单的case就可以复现:

create table t1(a int primary key);
insert into t1 (select * from generate_series(1,1000000,1)g);
select * from t1 where a in (1,2) or a in (10001,10002) or a in (20001,20002) or a in (40001,40002);

结果里只有前两个in起了作用。

aunjgr avatar Jun 19 '24 02:06 aunjgr

测试步骤:

CREATE TABLE `table_with_pk_index_for_load_100m` (
  `id` BIGINT NOT NULL,
  `col1` TINYINT DEFAULT NULL,
  `col2` SMALLINT DEFAULT NULL,
  `col3` INT DEFAULT NULL,
  `col4` BIGINT DEFAULT NULL,
  `col5` TINYINT UNSIGNED DEFAULT NULL,
  `col6` SMALLINT UNSIGNED DEFAULT NULL,
  `col7` INT UNSIGNED DEFAULT NULL,
  `col8` BIGINT UNSIGNED DEFAULT NULL,
  `col9` FLOAT DEFAULT NULL,
  `col10` DOUBLE DEFAULT NULL,
  `col11` VARCHAR(255) DEFAULT NULL,
  `col12` DATE DEFAULT NULL,
  `col13` DATETIME DEFAULT NULL,
  `col14` TIMESTAMP NULL DEFAULT NULL,
  `col15` BOOL DEFAULT NULL,
  `col16` DECIMAL(16,6) DEFAULT NULL,
  `col17` TEXT DEFAULT NULL,
  `col18` JSON DEFAULT NULL,
  `col19` BLOB DEFAULT NULL,
  `col20` BINARY(255) DEFAULT NULL,
  `col21` VARBINARY(255) DEFAULT NULL,
  `col22` VECF32(3) DEFAULT NULL,
  `col23` VECF32(3) DEFAULT NULL,
  `col24` VECF64(3) DEFAULT NULL,
  `col25` VECF64(3) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `col4` (`col4`),
  KEY `col3` (`col3`)
);

load data url s3option {'endpoint'='http://cos.ap-guangzhou.myqcloud.com','access_key_id'='','secret_access_key'='','bucket'='mo-load-guangzhou-1308875761','filepath'='mo-big-data/1000000000_20_columns_load_data_pk.csv'} into table big_data_test.table_with_pk_index_for_load_100M fields terminated by '|' lines terminated by '\n' ignore 1 lines parallel 'true';

select count(*) from big_data_test.table_with_pk_index_for_load_100M where id in (1,10000,100000,1000000,5000000,11000000,21000000,41000000,61000000,81000000) or
                                                                    id in (2,20000,200000,2000000,5100000,12000000,22000000,42000000,62000000,82000000) or
                                                                    id in (3,30000,300000,3000000,5200000,13000000,23000000,43000000,63000000,83000000) or
                                                                    id in (4,40000,400000,4000000,5300000,14000000,24000000,44000000,64000000,84000000) or
                                                                    id in (5,50000,500000,5000000,5400000,15000000,25000000,45000000,65000000,85000000) or
                                                                    id in (6,60000,600000,6000000,5500000,16000000,26000000,46000000,66000000,86000000) or
                                                                    id in (7,70000,700000,7000000,5600000,17000000,27000000,47000000,67000000,87000000) or
                                                                    id in (8,80000,800000,8000000,5700000,18000000,28000000,48000000,68000000,88000000) or
                                                                    id in (9,90000,900000,9000000,5800000,19000000,29000000,49000000,69000000,89000000) or
                                                                    id in (10,100001,1000001,10000001,5900000,20000001,30000000,50000000,70000000,90000000)
                                                                    order by id;

main commit:c6fa50c4f9f83cf5a76cf09a239f1be79c0646c5 image

1.2 commit: 4df8e561b3a6f36ab519d19f69d55956bab90955 image

结论:main与1.2-dev测试均通过.

Ariznawlll avatar Jun 21 '24 03:06 Ariznawlll

Wrong result is automatically s-1

fengttt avatar Jul 01 '24 07:07 fengttt

https://github.com/matrixorigin/matrixone/issues/17200#issuecomment-2203196673

zengyan1 avatar Jul 02 '24 13:07 zengyan1

https://github.com/matrixorigin/matrixone/issues/17200#issuecomment-2206084934

zengyan1 avatar Jul 03 '24 13:07 zengyan1

https://github.com/matrixorigin/matrixone/issues/17200#issuecomment-2208500850

zengyan1 avatar Jul 04 '24 12:07 zengyan1

https://github.com/matrixorigin/matrixone/issues/17200#issuecomment-2210613559

zengyan1 avatar Jul 05 '24 10:07 zengyan1

该issue可以close

zengyan1 avatar Jul 08 '24 08:07 zengyan1