
[Bug]: delete from table panic.

Open Ariznawlll opened this issue 1 year ago • 7 comments

Is there an existing issue for the same bug?

  • [X] I have checked the existing issues.

Branch Name

main

Commit ID

39ba16c

Other Environment Information

- Hardware parameters:
- OS type:
- Others:

Actual Behavior

TKE environment:


log:https://grafana.ci.matrixorigin.cn/explore?panes=%7B%22iwM%22:%7B%22datasource%22:%22loki%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bnamespace%3D%5C%22mo-big-data-20241015%5C%22%7D%20%7C%3D%20%60panic%60%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22loki%22%7D,%22editorMode%22:%22builder%22%7D%5D,%22range%22:%7B%22from%22:%221729072196000%22,%22to%22:%221729072436000%22%7D%7D%7D&schemaVersion=1&orgId=1

Expected Behavior

No response

Steps to Reproduce

Execute in a distributed environment:

create database test;
use test;
create table t(a int);
insert into t values (1),(2);
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
select count(*) from t;
delete from t where a=1;
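For scale: the script starts from 2 rows and doubles the table 26 times, so `select count(*)` returns 134,217,728 and the delete matches half of them, each match presumably needing one row-lock entry. A quick check of the arithmetic:

```go
package main

import "fmt"

func main() {
	rows := 2 // insert into t values (1),(2)
	for i := 0; i < 26; i++ {
		rows *= 2 // each "insert into t select * from t" doubles the table
	}
	fmt.Println(rows)     // total rows: 134217728
	fmt.Println(rows / 2) // rows matching a=1: 67108864
}
```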

Additional information

No response

Ariznawlll avatar Oct 16 '24 09:10 Ariznawlll

@jensenojs @ouyuanning Please take a look at this issue, and check whether the earlier optimization still has problems.

aressu1985 avatar Oct 18 '24 02:10 aressu1985

@jensenojs, take a look at why this was not converted to a table lock.

ouyuanning avatar Oct 18 '24 02:10 ouyuanning

mysql> delete from t where a=1;
ERROR 20101 (HY000): internal error: panic too large fixed slice 33554432, max is 16777216: 
github.com/matrixorigin/matrixone/pkg/lockservice.(*fixedSlicePool).acquire
        /go/src/github.com/matrixorigin/matrixone/pkg/lockservice/slice.go:219
github.com/matrixorigin/matrixone/pkg/lockservice.(*cowSlice).append
        /go/src/github.com/matrixorigin/matrixone/pkg/lockservice/slice.go:65
github.com/matrixorigin/matrixone/pkg/lockservice.(*activeTxn).lockAdded
        /go/src/github.com/matrixorigin/matrixone/pkg/lockservice/txn.go:121
github.c
mysql> 
mysql> 
mysql> 
mysql> explain verbose delete from t where a=1;
+-------------------------------------------------------------------------------------------------------------+
| TP QUERY PLAN                                                                                               |
+-------------------------------------------------------------------------------------------------------------+
| Delete[4] on test.t (cost=4096.00 outcnt=4096.00 selectivity=0.5000 blockNum=2)                             |
|   ->  Lock[3] (cost=4096.00 outcnt=4096.00 selectivity=0.5000 blockNum=2)                                   |
|         Lock level: Row level lock                                                                          |
|         ->  Project[2] (cost=4096.00 outcnt=4096.00 selectivity=0.5000 blockNum=2)                          |
|               Output: a, __mo_fake_pk_col, __mo_rowid                                                       |
|               ->  Project[1] (cost=4096.00 outcnt=4096.00 selectivity=0.5000 blockNum=2)                    |
|                     Output: t.a, t.__mo_fake_pk_col, t.__mo_rowid                                           |
|                     ->  Table Scan[0] on test.t (cost=8192.00 outcnt=4096.00 selectivity=0.5000 blockNum=2) |
|                           Output: t.a, t.__mo_fake_pk_col, t.__mo_rowid                                     |
|                           Table: 't' (0:'a', 1:'__mo_fake_pk_col', 2:'__mo_rowid')                          |
|                           Filter Cond: (t.a = 1)                                                            |
+-------------------------------------------------------------------------------------------------------------+
11 rows in set (0.00 sec)

mysql> delete from t where a=1;
ERROR 20101 (HY000): internal error: panic too large fixed slice 33554432, max is 16777216: 
github.com/matrixorigin/matrixone/pkg/lockservice.(*fixedSlicePool).acquire
        /go/src/github.com/matrixorigin/matrixone/pkg/lockservice/slice.go:219
github.com/matrixorigin/matrixone/pkg/lockservice.(*cowSlice).append
        /go/src/github.com/matrixorigin/matrixone/pkg/lockservice/slice.go:65
github.com/matrixorigin/matrixone/pkg/lockservice.(*activeTxn).lockAdded
        /go/src/github.com/matrixorigin/matrixone/pkg/lockservice/txn.go:121
github.c

gouhongshen avatar Oct 19 '24 09:10 gouhongshen
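The stack trace points at the lock service's fixed slice pool. Below is a minimal sketch of the failing condition, reconstructed from the panic message alone (the names, power-of-two size classes, and cap are assumptions, not MatrixOne's actual implementation): a request that rounds up past the pool's maximum size class is rejected.

```go
package main

import "fmt"

// maxFixedSliceSize is taken from the panic message ("max is 16777216");
// whether it counts bytes or entries is an assumption here.
const maxFixedSliceSize = 16777216

// roundUpPow2 returns the smallest power of two >= n.
func roundUpPow2(n int) int {
	s := 1
	for s < n {
		s *= 2
	}
	return s
}

// acquire picks the size class for n and fails when that class exceeds
// the pool's maximum -- the condition behind the reported panic.
func acquire(n int) (int, error) {
	size := roundUpPow2(n)
	if size > maxFixedSliceSize {
		return 0, fmt.Errorf("too large fixed slice %d, max is %d", size, maxFixedSliceSize)
	}
	return size, nil
}

func main() {
	// One entry past the cap rounds up to 33554432, matching the error text.
	if _, err := acquire(maxFixedSliceSize + 1); err != nil {
		fmt.Println(err)
	}
}
```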

I tested upstream/main as well as commit 39ba16c5a in three environments (my local Mac, the Shenzhen PC, and TKE) and have not reproduced this issue yet. The first run on TKE hit a different problem; other than that, no failures so far.

I checked with the QA colleague: she ran it three times and got a different error each time, one panic, one OOM, and one r-w conflict.

@gouhongshen seems to hit this problem very easily, and his results do show that the row locks were not escalated to a table lock in time.

jensenojs avatar Oct 19 '24 09:10 jensenojs

Local Mac, 129, and binary search can all stably reproduce this by following the steps above, with the default configuration.

However, raising the lock slice max parameter skips this panic:

	MaxFixedSliceSize toml.ByteSize `toml:"max-fixed-slice-size"`
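A hedged example of overriding it in the TOML config (the key comes from the struct tag above; the section path and the value shown are assumptions that depend on the deployment's config layout, and raising it only defers the panic rather than fixing the root cause):

```toml
# Illustrative only: place under the lock service's config section.
max-fixed-slice-size = "32MB"
```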

gouhongshen avatar Oct 19 '24 09:10 gouhongshen

I'm handling this together with the issue below:

  • https://github.com/matrixorigin/MO-Cloud/issues/4116

jensenojs avatar Oct 21 '24 03:10 jensenojs

`insert into ... select ... from ...` with a large data volume also triggers this fairly easily. Commit: 72b1061

mysql> insert into test01(b) select b from test01 where a<200000000;
ERROR 20101 (HY000): internal error: panic too large fixed slice 33554432, max is 16777216: 
github.com/matrixorigin/matrixone/pkg/lockservice.(*fixedSlicePool).acquire
        /go/src/github.com/matrixorigin/matrixone/pkg/lockservice/slice.go:219
github.com/matrixorigin/matrixone/pkg/lockservice.(*cowSlice).append
        /go/src/github.com/matrixorigin/matrixone/pkg/lockservice/slice.go:65
github.com/matrixorigin/matrixone/pkg/lockservice.(*activeTxn).lockAdded
        /go/src/github.com/matrixorigin/matrixone/pkg/lockservice/txn.go:121
github.c

heni02 avatar Oct 23 '24 06:10 heni02

For the current 2.0 release, we can only keep this behavior as-is for now.

aressu1985 avatar Oct 24 '24 02:10 aressu1985

This PR fixes the root cause of this panic:

  • https://github.com/matrixorigin/matrixone/pull/19567

jensenojs avatar Oct 24 '24 05:10 jensenojs

testing

Ariznawlll avatar Oct 25 '24 06:10 Ariznawlll

Testing in a distributed environment:

Scenario 1: ran three times; the third run reported an r-w conflict

mysql> select git_version();
+---------------+
| git_version() |
+---------------+
| d6f4467       |
+---------------+
1 row in set (0.00 sec)

create database test;
use test;
create table t(a int);
insert into t values (1),(2);
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
select count(*) from t;
delete from t where a=1;

Scenario 2: ran three times; all three succeeded

create table  if not exists big_data_test.table_with_com_pk_index_for_load_100M( id bigint auto_increment, col1 tinyint, col2 smallint, col3 int, col4 bigint, col5 tinyint unsigned, col6 smallint unsigned, col7 int unsigned, col8 bigint unsigned, col9 float, col10 double, col11 varchar(255), col12 Date, col13 DateTime, col14 timestamp, col15 bool, col16 decimal(16,6), col17 text, col18 json, col19 blob, col20 binary(255), col21 varbinary(255), col22 vecf32(3), col23 vecf32(3), col24 vecf64(3), col25 vecf64(3), primary key (id, col1), key(col3,col4), unique key(col4) );

load data url s3option {'endpoint'='http://cos.ap-guangzhou.myqcloud.com','access_key_id'='***','secret_access_key'='***','bucket'='mo-load-guangzhou-1308875761','filepath'='mo-big-data/100000000_20_columns_load_data_pk.csv'} into table big_data_test.table_with_com_pk_index_for_load_100M fields terminated by '|' lines terminated by '\n' ignore 1 lines parallel 'true';

mysql> show create table big_data_test.table_with_com_pk_index_for_load_100M;
+---------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table                                 | Create Table                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
+---------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| table_with_com_pk_index_for_load_100m | CREATE TABLE `table_with_com_pk_index_for_load_100m` (
  `id` bigint NOT NULL AUTO_INCREMENT,
  `col1` tinyint NOT NULL,
  `col2` smallint DEFAULT NULL,
  `col3` int DEFAULT NULL,
  `col4` bigint DEFAULT NULL,
  `col5` tinyint unsigned DEFAULT NULL,
  `col6` smallint unsigned DEFAULT NULL,
  `col7` int unsigned DEFAULT NULL,
  `col8` bigint unsigned DEFAULT NULL,
  `col9` float DEFAULT NULL,
  `col10` double DEFAULT NULL,
  `col11` varchar(255) DEFAULT NULL,
  `col12` date DEFAULT NULL,
  `col13` datetime DEFAULT NULL,
  `col14` timestamp NULL DEFAULT NULL,
  `col15` bool DEFAULT NULL,
  `col16` decimal(16,6) DEFAULT NULL,
  `col17` text DEFAULT NULL,
  `col18` json DEFAULT NULL,
  `col19` blob DEFAULT NULL,
  `col20` binary(255) DEFAULT NULL,
  `col21` varbinary(255) DEFAULT NULL,
  `col22` vecf32(3) DEFAULT NULL,
  `col23` vecf32(3) DEFAULT NULL,
  `col24` vecf64(3) DEFAULT NULL,
  `col25` vecf64(3) DEFAULT NULL,
  PRIMARY KEY (`id`,`col1`),
  UNIQUE KEY `col4` (`col4`),
  KEY `col3` (`col3`,`col4`)
) |
+---------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> select count(*) from big_data_test.table_with_com_pk_index_for_load_100M;
+-----------+
| count(*)  |
+-----------+
| 100000000 |
+-----------+
1 row in set (0.05 sec)

mysql> create table test01 (col1 smallint DEFAULT NULL);
Query OK, 0 rows affected (0.15 sec)

mysql> insert into test01 select(col2) from big_data_test.table_with_com_pk_index_for_load_100M where col1 < 100000000;
Query OK, 100000000 rows affected (28.34 sec)

Conclusion: no panic occurred during any of the tests.

Ariznawlll avatar Oct 25 '24 06:10 Ariznawlll