matrixone
[Bug]: logservice lost connection due to memory constraints, swap
Is there an existing issue for the same bug?
- [X] I have checked the existing issues.
Environment
- Version or commit-id (e.g. v0.1.0 or 8b23a93): 6a65b65cecb3
- Hardware parameters:
- OS type:
- Others:
Actual Behavior
Run the following test against a Docker Compose deployment with 2 CN nodes. On a machine with relatively little memory (16 GB), the system starts swapping, and requests time out because of memory pressure.
Expected Behavior
No response
Steps to Reproduce
ftian;
drop table if exists t;
create table t (i int, j int);
insert into t values (1, 1), (2, 2), (3, 3), (4, 4), (5, null), (null, 5);
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
select count(*) from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
select count(*) from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
select count(*) from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
select count(*) from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
select count(*) from t;
delete from t where i = 1;
delete from t where i = 2;
select count(*) from t;
insert into t select * from t;
select count(*) from t;
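To make the scale of the repro easier to see, the row count expected at each `select count(*)` can be worked out by hand. The sketch below is only an illustration and is not part of the original report. It assumes every statement succeeds and that the deletes follow standard SQL NULL semantics, so the row with a NULL `i` is not matched. The function name and constants are hypothetical.

```python
# Row counts expected at each `select count(*)` in the repro script.
# Assumption: every statement succeeds and `i = 1` / `i = 2` never match NULL.

SEED_ROWS = 6          # rows inserted by the initial `insert ... values`
ROWS_WITH_I_1 = 1      # seed rows where i = 1
ROWS_WITH_I_2 = 1      # seed rows where i = 2


def expected_counts():
    """Return the row count expected at each of the seven count(*) checkpoints."""
    counts = []
    rows = SEED_ROWS
    # Five groups of five self-inserts; each self-insert doubles the table.
    for _ in range(5):
        rows *= 2 ** 5
        counts.append(rows)
    # Every seed row has been copied equally often, so the two deletes remove
    # the same share of the table that those rows made up in the seed data.
    copies = rows // SEED_ROWS
    rows -= copies * (ROWS_WITH_I_1 + ROWS_WITH_I_2)
    counts.append(rows)
    # The final self-insert doubles what is left.
    rows *= 2
    counts.append(rows)
    return counts


if __name__ == "__main__":
    for checkpoint, count in enumerate(expected_counts(), start=1):
        print(f"count(*) #{checkpoint}: {count:,}")
```

Under these assumptions the table holds about 201 million rows before the deletes, each delete removes about 33.5 million rows, and the last insert copies about 134 million rows.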
Additional information
This is used to repro #9447
XuPeng did memory profiling; the memory is mainly used in S3Writer.WriteS3Batch. A quick look at the code suggests that it needs a major rewrite.
no fix.
no fix.
What does this mean? That you don't know how to fix it, or that you think we should not fix it?
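I have not had time to look at this issue yet.
Tested with the latest main branch, f7a8c81b9568be75bf3257d5e4375ed82a479546. The insert statements all complete fairly quickly, but the delete from statement does not return for a long time, and CPU usage is very low during that time, about 0.3 core. We need to trace what goes wrong in the delete flow.
Still working on other issues.
No work has been done on this.
No work has been done on this. There is no time to schedule it for 1.2, so it goes into the backlog for now.
This work will not be scheduled in the near term.
Same as the previous comment.
Same as the previous comment.
Working on another s-1 issue.
Working on the connection reset s-1 issue.
No effort invested.
Did not follow up on this issue today.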