matrixone

[Bug]: serverless instance CN oom

Open loveRhythm1990 opened this issue 2 years ago • 26 comments

Is there an existing issue for the same bug?

  • [X] I have checked the existing issues.

Environment

- Version or commit-id (e.g. v0.1.0 or 8b23a93): nightly-90dac68d
- Hardware parameters:
- OS type:
- Others:

Namespace: 034348ea-9dfe-46e7-8f40-18dec1307c98

Actual Behavior

The CN OOMs continuously.

Expected Behavior

No response

Steps to Reproduce

No response

Additional information

A brief profile was attached as an image. The detailed profile can be viewed in the following doc: https://doc.weixin.qq.com/doc/w3_AW0A-gb6AOIAWdUX2NbSWevRb4vhF?scode=AJsA6gc3AA8B9Bz0x4AdoAqAZiAJU

loveRhythm1990 avatar Nov 15 '23 03:11 loveRhythm1990

@aylei please take a look. This is caused by the memcache being configured too large. The current memcache cannot handle such a large cache size.

nnsgmsone avatar Nov 15 '23 08:11 nnsgmsone

@nnsgmsone The memcache has already been adjusted to 1Gi, but it still OOMs.

aylei avatar Nov 20 '23 02:11 aylei

discussed with @nnsgmsone and set the severity to s0

aylei avatar Nov 20 '23 02:11 aylei

discussed with @nnsgmsone and set the severity to s0

Handle with highest priority: s-1

tianyahui-python avatar Nov 20 '23 02:11 tianyahui-python

Still working on the branch.

nnsgmsone avatar Nov 21 '23 10:11 nnsgmsone

Working on the timeout issue caused by morpc; one bug has been located. Xu (旭哥) has already submitted a PR.

nnsgmsone avatar Nov 22 '23 10:11 nnsgmsone

Still working on the morpc-related issues on the mem branch.

nnsgmsone avatar Nov 23 '23 10:11 nnsgmsone

Tested: it runs on both eks and 129 (https://github.com/matrixorigin/mo-auto-test/actions/runs/7015615926/job/19085534500). The "stream closed" issue on tke is waiting to be reproduced and will then be fixed.

nnsgmsone avatar Nov 28 '23 10:11 nnsgmsone

Investigating the tpcc performance regression on the branch.

nnsgmsone avatar Dec 01 '23 10:12 nnsgmsone

Today: fixing #13219 and adding more metrics.

nnsgmsone avatar Dec 06 '23 10:12 nnsgmsone

No progress.

nnsgmsone avatar Dec 08 '23 10:12 nnsgmsone

No progress.

nnsgmsone avatar Dec 13 '23 10:12 nnsgmsone

Waiting on https://github.com/matrixorigin/matrixone/issues/12532

nnsgmsone avatar Dec 25 '23 10:12 nnsgmsone

Waiting on https://github.com/matrixorigin/matrixone/issues/12532

nnsgmsone avatar Dec 28 '23 10:12 nnsgmsone

Discussing https://github.com/matrixorigin/matrixone/issues/12532 with colleagues from the storage team.

nnsgmsone avatar Jan 03 '24 10:01 nnsgmsone

The memory issue is waiting on #12532.

nnsgmsone avatar Jan 08 '24 10:01 nnsgmsone

The memory issue is waiting on #12532.

nnsgmsone avatar Jan 12 '24 10:01 nnsgmsone

The memory issue is waiting on https://github.com/matrixorigin/matrixone/issues/12532

nnsgmsone avatar Jan 25 '24 10:01 nnsgmsone

The memory issue is waiting on https://github.com/matrixorigin/matrixone/issues/12532

nnsgmsone avatar Jan 30 '24 10:01 nnsgmsone

The memory issue is waiting on https://github.com/matrixorigin/matrixone/issues/12532

nnsgmsone avatar Feb 02 '24 10:02 nnsgmsone

No progress.

nnsgmsone avatar Feb 21 '24 13:02 nnsgmsone

No progress.

nnsgmsone avatar Feb 26 '24 10:02 nnsgmsone

Waiting for the PR to be merged.

nnsgmsone avatar Feb 29 '24 10:02 nnsgmsone

Waiting for the PR to be merged.

nnsgmsone avatar Mar 05 '24 10:03 nnsgmsone

Waiting for the PR to be merged.

nnsgmsone avatar Mar 08 '24 10:03 nnsgmsone

Working on a transaction leak.

nnsgmsone avatar Mar 13 '24 10:03 nnsgmsone

No progress.

nnsgmsone avatar Mar 18 '24 11:03 nnsgmsone

No progress.

nnsgmsone avatar Mar 21 '24 10:03 nnsgmsone

Waiting for the PR to be merged, then will test.

nnsgmsone avatar Mar 26 '24 10:03 nnsgmsone

Waiting for test verification.

nnsgmsone avatar Apr 01 '24 10:04 nnsgmsone