OOM on the mbuf freelist (100+GB)

Open inter169 opened this issue 7 years ago • 4 comments

The nutcracker process consumed 100+ GB of physical memory on my production box after a data migration from another Redis to this one (nutcracker). The gdb console showed the following:

(gdb) p nfree_mbufq
$1 = 6365614
(gdb) p mbuf_chunk_size
$1 = 16384

The memory consumption was nfree_mbufq * mbuf_chunk_size = 6365614 * 16384 bytes, which is roughly 104 GB (about 97 GiB). I have read some fixes (PRs and issues) for similar behavior, such as: https://github.com/twitter/twemproxy/pull/461 https://github.com/twitter/twemproxy/issues/203
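As a sanity check on that estimate, the product of the two gdb values can be computed directly. This is only an illustrative sketch: the helper names `freelist_bytes` and `print_freelist_usage` are hypothetical and not part of twemproxy, and it assumes every free mbuf occupies exactly mbuf_chunk_size bytes, ignoring allocator overhead.

```c
#include <stdint.h>
#include <stdio.h>

/* Bytes held by the free list, assuming one chunk of chunk_size bytes
 * per free mbuf (hypothetical helper, not a twemproxy function). */
static uint64_t freelist_bytes(uint64_t nfree, uint64_t chunk_size)
{
    return nfree * chunk_size;
}

/* Print the estimate in bytes, decimal gigabytes and binary gibibytes. */
static void print_freelist_usage(uint64_t nfree, uint64_t chunk_size)
{
    uint64_t bytes = freelist_bytes(nfree, chunk_size);

    printf("%llu bytes = %.1f GB = %.1f GiB\n",
           (unsigned long long)bytes,
           (double)bytes / 1e9,
           (double)bytes / (double)(1ULL << 30));
}
```

Calling `print_freelist_usage(6365614, 16384)` with the values from the gdb session gives 104294219776 bytes, which is in the same range as the 100+ GB observed for the process.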

However, those fixes do not limit the number of mbuf chunks, so the OOM can still happen. I wrote a fix that lets nutcracker take a command-line parameter ('-n' in my fix) to set the maximum number of mbuf chunks. Once that limit is exceeded, nutcracker frees an mbuf immediately after processing a request.
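The idea of capping the free list can be sketched as follows. This is a minimal, standalone illustration and not the code from my fix or from twemproxy: the `chunk_pool` type, its fields, and the `pool_*` functions are all hypothetical names, and the cap `max_free` stands in for whatever the '-n' parameter would configure.

```c
#include <stddef.h>
#include <stdlib.h>

/* A chunk reuses its first bytes as a link while it sits on the free list. */
struct chunk {
    struct chunk *next;
};

/* Hypothetical pool with a bounded free list (not twemproxy's mbuf API). */
struct chunk_pool {
    struct chunk *free_head; /* LIFO list of cached chunks */
    size_t nfree;            /* number of chunks currently cached */
    size_t max_free;         /* cap on cached chunks; 0 means unlimited */
    size_t chunk_size;       /* bytes per chunk */
};

static void pool_init(struct chunk_pool *p, size_t chunk_size, size_t max_free)
{
    p->free_head = NULL;
    p->nfree = 0;
    p->max_free = max_free;
    /* A chunk must be large enough to hold the free-list link. */
    p->chunk_size = chunk_size < sizeof(struct chunk) ? sizeof(struct chunk)
                                                      : chunk_size;
}

/* Reuse a cached chunk if one exists, otherwise allocate a new one. */
static void *pool_get(struct chunk_pool *p)
{
    if (p->free_head != NULL) {
        struct chunk *c = p->free_head;
        p->free_head = c->next;
        p->nfree--;
        return c;
    }
    return malloc(p->chunk_size);
}

/* Cache the chunk while under the cap; past the cap, free it right away. */
static void pool_put(struct chunk_pool *p, void *buf)
{
    struct chunk *c = buf;

    if (buf == NULL) {
        return;
    }
    if (p->max_free != 0 && p->nfree >= p->max_free) {
        free(buf);
        return;
    }
    c->next = p->free_head;
    p->free_head = c;
    p->nfree++;
}

/* Release every cached chunk. */
static void pool_deinit(struct chunk_pool *p)
{
    while (p->free_head != NULL) {
        struct chunk *c = p->free_head;
        p->free_head = c->next;
        free(c);
    }
    p->nfree = 0;
}
```

With `max_free` set to N, the pool never caches more than N chunks, so the memory pinned by the free list is bounded by roughly N * chunk_size regardless of how large a traffic burst was.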

inter169 • Oct 02 '17 03:10

Hi, can you share your fix? I am facing the same issue.

gauxs • Feb 17 '19 07:02

I merged the code fix into my repo: https://github.com/inter169/twemproxy/commit/2b2a0d0cfdbb18d77b510a86283943eafa798d94

inter169 • Feb 23 '19 04:02

@inter169 In my test, memory is not returned to the OS even after freeing all mbufs and all msgs together. I think the reason is that malloc grows the heap with the brk syscall and cannot give fragmented free memory back to the OS.
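One thing that might be worth trying here, assuming glibc's malloc is in use, is asking the allocator to trim the heap after a large batch of frees. The sketch below is only an assumption about what could help, not something tested against nutcracker: `release_free_heap` is a hypothetical wrapper, while `malloc_trim` is the glibc function declared in `<malloc.h>`. Whether it actually shrinks the process depends on how fragmented the heap is.

```c
#include <stdlib.h>
#ifdef __GLIBC__
#include <malloc.h>
#endif

/* Ask the allocator to hand free heap memory back to the OS.
 * Returns 1 if some memory was released, 0 if none could be released,
 * and -1 if this is not glibc and the call is unavailable.
 * Hypothetical wrapper; nothing like it exists in twemproxy. */
static int release_free_heap(void)
{
#ifdef __GLIBC__
    return malloc_trim(0); /* 0 = keep no extra padding at the heap top */
#else
    return -1;
#endif
}
```

A call to `release_free_heap()` would go right after the point where the mbufs and msgs are freed. Comparing the process RSS before and after the call would show whether fragmentation is really what keeps the memory pinned.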

yongman • Apr 25 '19 03:04

https://github.com/twitter/twemproxy/pull/486 was merged in 2016 so 461 isn't relevant.

I'd agree that some sort of memory limit would make sense, and that mbufs aren't freed.

During pathological networking situations (e.g. limited bandwidth, extremely high timeouts, and high traffic), high memory usage can be a problem.

TysonAndre • Jun 16 '21 20:06