[Bug]: OOM when concurrent query

Open sukki37 opened this issue 3 years ago • 5 comments

Is there an existing issue for the same bug?

  • [X] I have checked the existing issues.

Environment

- Version or commit-id (e.g. v0.1.0 or 8b23a93):
- Hardware parameters:
- OS type:
- Others:

Actual Behavior

On a 32-core, 64 GB VM, two queries arrived at mo-server concurrently, and the VM crashed with an OOM error.

Expected Behavior

For now, MatrixOne does not support runtime memory limits. The default memory quota for the host/guest has been set to 1 << 40 (https://github.com/matrixorigin/matrixone/blob/main/cmd/db-server/main.go#L294), which is unreasonable. Maybe the server needs to queue or refuse queries when there are not enough system resources.

Steps to Reproduce

No response

Additional information

No response

sukki37 avatar Feb 23 '22 09:02 sukki37

This requires the frontend to set the memory limits: the host memory limit is shared across the machine, and the guest memory limit is independent for each query. If these are set correctly, the OOM problem can be avoided. @daviszhen
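
To make the host/guest split concrete, here is a minimal sketch in Go. It is not MatrixOne's actual code: `HostBudget`, `QueryBudget`, and all of their methods are hypothetical names chosen for illustration. It assumes one host budget shared by every query on the machine and one guest budget per query, so an allocation succeeds only if it fits under both limits.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrOutOfMemory is returned when an allocation would exceed a limit.
var ErrOutOfMemory = errors.New("memory limit exceeded")

// HostBudget is a hypothetical machine-wide budget shared by all queries.
type HostBudget struct {
	mu    sync.Mutex
	limit int64
	used  int64
}

func NewHostBudget(limit int64) *HostBudget { return &HostBudget{limit: limit} }

// reserve charges n bytes to the shared budget, or fails if it would overflow.
func (h *HostBudget) reserve(n int64) error {
	h.mu.Lock()
	defer h.mu.Unlock()
	if h.used+n > h.limit {
		return ErrOutOfMemory
	}
	h.used += n
	return nil
}

// release returns n bytes to the shared budget.
func (h *HostBudget) release(n int64) {
	h.mu.Lock()
	defer h.mu.Unlock()
	h.used -= n
}

// Used reports how many bytes are currently charged to the host.
func (h *HostBudget) Used() int64 {
	h.mu.Lock()
	defer h.mu.Unlock()
	return h.used
}

// QueryBudget is a hypothetical per-query (guest) budget backed by a host.
// It is meant to be used from a single goroutine.
type QueryBudget struct {
	host  *HostBudget
	limit int64
	used  int64
}

func (h *HostBudget) NewQuery(limit int64) *QueryBudget {
	return &QueryBudget{host: h, limit: limit}
}

// Alloc charges n bytes against the guest limit first, then the host limit.
func (q *QueryBudget) Alloc(n int64) error {
	if n < 0 {
		return errors.New("negative allocation size")
	}
	if q.used+n > q.limit {
		return ErrOutOfMemory
	}
	if err := q.host.reserve(n); err != nil {
		return err
	}
	q.used += n
	return nil
}

// Close releases everything the query still holds.
func (q *QueryBudget) Close() {
	q.host.release(q.used)
	q.used = 0
}

func main() {
	const GiB = int64(1) << 30
	host := NewHostBudget(64 * GiB) // sized to the machine, not 1 << 40
	q1 := host.NewQuery(48 * GiB)
	q2 := host.NewQuery(48 * GiB)
	fmt.Println("q1 alloc 40 GiB:", q1.Alloc(40*GiB))
	fmt.Println("q2 alloc 40 GiB:", q2.Alloc(40*GiB))
	q1.Close()
	fmt.Println("q2 retry 40 GiB:", q2.Alloc(40*GiB))
}
```

With this shape, a query that hits its limit gets an error it can handle, instead of the whole process being killed by the OS.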

nnsgmsone avatar Feb 23 '22 10:02 nnsgmsone

The frontend may also need to do some simple queuing or scheduling of queries; otherwise too much concurrency may cause all queries to fail. In addition, the current usage is wrong: each session gets its own separate host memory limit and guest memory limit, and the values are set much higher than the memory actually available.

nnsgmsone avatar Feb 23 '22 10:02 nnsgmsone

Got it. This will be postponed for now. It will be solved in late 0.3 or early 0.4.

daviszhen avatar Feb 23 '22 11:02 daviszhen

This issue needs to be tracked and fixed in 0.6.0.

domingozhang avatar Jul 01 '22 09:07 domingozhang

The key insight is memory accounting.

fengttt avatar Jul 20 '22 07:07 fengttt

I need to run this for several days.

aressu1985 avatar Nov 08 '22 12:11 aressu1985

Tonight I will run point_select for 12 hours.

aressu1985 avatar Nov 09 '22 14:11 aressu1985