predixy crash

Open beebol opened this issue 4 years ago • 10 comments

predixy version: predixy-1.0.5-pre
OS version: CentOS Linux release 7.4.1708 (Core)

2020-12-01 11:36:52.793291 E Backtrace.h:18 predixy backtrace(11)
/usr/bin/predixy(_Z9traceInfoi+0x12f)[0x4b3d3f]
/usr/bin/predixy[0x4b2b1d]
/lib64/libc.so.6(+0x35270)[0x7f5a79dc6270]
/usr/bin/predixy(_ZNK7Request6isDoneEv+0x0)[0x486c00]
/usr/bin/predixy(_ZN16AcceptConnection4sendEP7HandlerP7RequestP8Response+0x22)[0x450e72]
/usr/bin/predixy(_ZN7Handler14handleResponseEP17ConnectConnectionP7RequestP8Response+0x25c)[0x4a9c6c]
/usr/bin/predixy(_ZN7Handler14directResponseEP7RequestN8Response11GenericCodeEP17ConnectConnection+0x12a)[0x4aa8ba]
/usr/bin/predixy(_ZN17ConnectConnection5closeEP7Handler+0xa9)[0x45b2a9]
/usr/bin/predixy(_ZN7Handler26postConnectConnectionEventEv+0x1ac)[0x4a60fc]
/usr/bin/predixy(_ZN7Handler3runEv+0x108)[0x4a8bf8]
/usr/bin/predixy(execute_native_thread_routine+0x20)[0x4f4b60]
/lib64/libpthread.so.0(+0x7e25)[0x7f5a7a673e25]
/lib64/libc.so.6(clone+0x6d)[0x7f5a79e8934d]
2020-12-01 11:36:52.794259 N Handler.cpp:212 h 2 remove c 10.4x.xx.x:31008 162 with status 4 EventError

beebol avatar Dec 03 '20 08:12 beebol

Has this problem been resolved?

CJLUzjj avatar Jul 02 '21 03:07 CJLUzjj

No, it has not been resolved.

beebol avatar Aug 03 '21 02:08 beebol

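I have run into this problem too, and there is no solution for now. Separately, after moving Redis into containers, have you ever seen the following: maxmemory is set to 10 GB and the pod memory limit to 15 GB, but after running for a while the pod's memory grows to 15 GB and the pod gets killed by k8s? We run containerized Redis at large scale, with over a thousand instances, and have hit a lot of pitfalls. Happy to compare notes; my WeChat ID is fuzengjie.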

fuzengjie avatar Sep 11 '21 02:09 fuzengjie

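I have tentatively located the cause. In essence, under certain conditions mget leaves the Handler holding a dangling pointer to an AcceptConnection, and that causes the crash. The workaround is simple: in the function Request::isDone(), comment out the line case Command::Mget:. The effect is that the mget reply is sent only after all backends have returned. I don't think this costs much performance, and the original design has a data inconsistency problem anyway.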
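To make the trade-off described above concrete, here is a toy model of the two completion rules. It is only a sketch and not the actual predixy source: ToyRequest, its fields, and the earlyMget flag are hypothetical names made up for illustration, and the real Request::isDone() may be structured differently.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified model of a proxy request. NOT predixy's real code.
enum class CommandType { Get, Mget };

struct ToyRequest {
    CommandType type;
    bool leaderDone;            // a reply has been prepared for the client
    std::size_t followers;      // number of per-backend sub-requests
    std::size_t followersDone;  // how many sub-requests have completed

    // earlyMget == true  models the behaviour with the Mget case present:
    //                    mget may be reported done before every backend replied.
    // earlyMget == false models the workaround (Mget case commented out):
    //                    mget is done only once all sub-requests have completed.
    bool isDone(bool earlyMget) const {
        if (type == CommandType::Mget && earlyMget) {
            return leaderDone;
        }
        return leaderDone && followersDone == followers;
    }
};
```

In this model, a request with three sub-requests of which one has finished counts as done under the early rule but not under the workaround, which is the "reply waits for all backends" effect mentioned above.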

CJLUzjj avatar Sep 13 '21 06:09 CJLUzjj

@CJLUzjj Did it work normally after you commented that line out? We have hit this core dump too. @fortrue @joyield Is there a plan to fix this bug?

panyfx avatar Sep 13 '21 08:09 panyfx

@panyfx I can't guarantee that it will work. The author no longer seems to be maintaining the project.

CJLUzjj avatar Sep 13 '21 08:09 CJLUzjj

Did you ever get this fixed? I am hitting this problem now; predixy crashes while running.

IT-xiaoge avatar May 20 '22 02:05 IT-xiaoge

This is not reproducible for us in our local and testing environments, but it is happening in our production environment. Can you suggest how we can reproduce it and fix it?

aniketverma-zomato avatar Jun 30 '22 05:06 aniketverma-zomato

We have identified that the issue is caused by a use-after-free involving AcceptConnection. After a client connection is destroyed and its resources are released, a new client connection is created, and the previously freed memory may be reused for it. When a response later arrives from Redis and the send() method is called, it references the previously released memory.

Here is a reproducible scenario:

  1. Set up a situation where the backend times out.
  2. Send an mget request to the proxy.
  3. Disconnect the client before receiving the response.
  4. Immediately reconnect to the proxy without performing any additional actions.
  5. The proxy crashes.
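The sequence above can be modeled in a few lines to show the general shape of a guard against this kind of bug. This is a minimal sketch under stated assumptions, not the fix that landed in predixy: ClientConn, PendingRequest, and lateReplyReachesNewClient are hypothetical names, and tracking the client through std::weak_ptr is a swapped-in technique chosen for illustration. The idea is that an in-flight request holds only a non-owning reference to its client, so a late backend reply is dropped instead of being sent through a stale pointer.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a client-side connection (not predixy's AcceptConnection).
struct ClientConn {
    std::vector<std::string> sent;
    void send(const std::string& reply) { sent.push_back(reply); }
};

// An in-flight backend request that remembers which client asked for it.
struct PendingRequest {
    std::weak_ptr<ClientConn> client;  // non-owning; expires when the client is destroyed

    // Returns true if the reply was delivered, false if the client is gone.
    bool deliver(const std::string& reply) const {
        if (auto c = client.lock()) {
            c->send(reply);
            return true;
        }
        return false;  // client already closed: drop the reply
    }
};

// Replays steps 2 to 4 and reports whether the late reply reached any client.
bool lateReplyReachesNewClient() {
    auto first = std::make_shared<ClientConn>();
    PendingRequest pending{first};                 // step 2: mget is in flight
    first.reset();                                 // step 3: client disconnects
    auto second = std::make_shared<ClientConn>();  // step 4: reconnect, memory may be reused
    bool delivered = pending.deliver("late mget reply");  // backend finally answers
    return delivered || !second->sent.empty();
}
```

With a raw pointer in place of the weak_ptr, the deliver call would dereference freed (and possibly reused) memory, which matches the crash described above. Here lateReplyReachesNewClient() returns false because the reply is discarded.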

maochongxin avatar Jan 25 '24 09:01 maochongxin

This issue has been confirmed and fixed. Everyone can use the branch fix/issue124 to verify it.

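The approach suggested by @CJLUzjj also avoids this problem:

"I have tentatively located the cause. In essence, under certain conditions mget leaves the Handler holding a dangling pointer to an AcceptConnection, and that causes the crash. The workaround is simple: in the function Request::isDone(), comment out the line case Command::Mget:. The effect is that the mget reply is sent only after all backends have returned. I don't think this costs much performance, and the original design has a data inconsistency problem anyway."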

fortrue avatar Jan 25 '24 14:01 fortrue