
With swoole 5.1.1, the service tends to hang after running in production for a while

Open bain2018 opened this issue 1 year ago • 1 comments

The production server uses Swoole to consume data from a Redis queue. After running for a while, the service hangs and stops working: messages keep piling up in the queue, and a restart restores normal operation. The Swoole version currently running is 5.1.1. A fragment of the strace -o output follows:

3207058 futex(0x26a9298, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
3207053 futex(0x26a929c, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
3207052 futex(0x26a929c, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
3207049 futex(0x26a929c, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
3207046 futex(0x26a9298, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
3207044 futex(0x26a929c, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
3207042 futex(0x26a9298, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
3207041 futex(0x26a929c, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
3207040 futex(0x26a9298, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
3206939 epoll_wait(155, [{events=EPOLLIN, data={u32=43441472, u64=43441472}}], 4096, 225) = 1
3206939 recvfrom(195, "\10\0\0\0\0\0\0", 7, 0, NULL, NULL) = 7
3206939 epoll_ctl(155, EPOLL_CTL_DEL, 195, NULL) = 0
3206939 recvfrom(195, "\316", 1, 0, NULL, NULL) = 1
3206939 epoll_wait(155, [], 4096, 1) = 0
3206939 recvfrom(195, 0x7fb0d45ac1d8, 7, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
3206939 epoll_ctl(155, EPOLL_CTL_ADD, 195, {events=EPOLLIN, data={u32=43441472, u64=43441472}}) = 0
3206939 epoll_wait(155, [], 4096, 196) = 0
3206939 epoll_wait(155, [{events=EPOLLIN, data={u32=42676688, u64=42676688}}], 4096, 1000) = 1
3206939 recvfrom(196, "\10\0\0\0\0\0\0", 7, 0, NULL, NULL) = 7
3206939 epoll_ctl(155, EPOLL_CTL_DEL, 196, NULL) = 0
3206939 recvfrom(196, "\316", 1, 0, NULL, NULL) = 1
3206939 epoll_wait(155, [], 4096, 1) = 0
3206939 recvfrom(196, 0x7fb0ff081158, 7, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
3206939 epoll_ctl(155, EPOLL_CTL_ADD, 196, {events=EPOLLIN, data={u32=42676688, u64=42676688}}) = 0
3206939 epoll_wait(155, [], 4096, 15) = 0
3206939 epoll_wait(155, [{events=EPOLLIN, data={u32=43441472, u64=43441472}}], 4096, 640) = 1
3206939 recvfrom(195, "\10\0\0\0\0\0\0", 7, 0, NULL, NULL) = 7
3206939 epoll_ctl(155, EPOLL_CTL_DEL, 195, NULL) = 0
………………
3206939 recvfrom(195, "\10\0\0\0\0\0\0", 7, 0, NULL, NULL) = 7
3206939 epoll_ctl(155, EPOLL_CTL_DEL, 195, NULL) = 0
3206939 recvfrom(195, "\316", 1, 0, NULL, NULL) = 1
3206939 epoll_wait(155, [], 4096, 1) = 0
3206939 recvfrom(195, 0x7fb0fec641d8, 7, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
3206939 epoll_ctl(155, EPOLL_CTL_ADD, 195, {events=EPOLLIN, data={u32=43441472, u64=43441472}}) = 0
3206939 epoll_wait(155, [], 4096, 321) = 0
3206939 epoll_wait(155, [], 4096, 11) = 0
3206939 epoll_wait(155, [], 4096, 8) = 0
3206939 epoll_wait(155, [], 4096, 21) = 0
3206939 epoll_wait(155, [], 4096, 8) = 0
3206939 sendto(196, "\10\0\0\0\0\0\0\316", 8, 0, NULL, 0) = 8
3206939 epoll_wait(155, [], 4096, 2) = 0
3206939 sendto(195, "\10\0\0\0\0\0\0\316", 8, 0, NULL, 0) = 8
3206939 epoll_wait(155, [], 4096, 347) = 0
3206939 epoll_wait(155, [], 4096, 313) = 0
3206939 epoll_ctl(155, EPOLL_CTL_DEL, 159, NULL) = 0
3206939 epoll_wait(155, [], 4096, 1) = 0
3206939 recvfrom(159, 0x7fb0d4213018, 65535, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
3206939 epoll_ctl(155, EPOLL_CTL_ADD, 159, {events=EPOLLIN, data={u32=41372368, u64=41372368}}) = 0
3206939 epoll_wait(155, [{events=EPOLLIN, data={u32=42676688, u64=42676688}}], 4096, 686) = 1
3206939 recvfrom(196, "\10\0\0\0\0\0\0", 7, 0, NULL, NULL) = 7
3206939 epoll_ctl(155, EPOLL_CTL_DEL, 196, NULL) = 0
3206939 recvfrom(196, "\316", 1, 0, NULL, NULL) = 1
3206939 madvise(0x4cae000, 212992, MADV_DONTNEED) = 0
3206939 madvise(0x27db000, 61440, MADV_DONTNEED) = 0
3206939 madvise(0x43be000, 24576, MADV_DONTNEED) = 0
3206939 madvise(0x47be000, 12288, MADV_DONTNEED) = 0
3206939 madvise(0x4ad4000, 32768, MADV_DONTNEED) = 0
3206939 madvise(0x4b56000, 20480, MADV_DONTNEED) = 0
3206939 madvise(0x27ae000, 40960, MADV_DONTNEED) = 0
3206939 madvise(0x29eb000, 24576, MADV_DONTNEED) = 0
3206939 madvise(0x4b38000, 4096, MADV_DONTNEED) = 0
3206939 madvise(0x4b95000, 12288, MADV_DONTNEED) = 0
3206939 madvise(0x470c000, 49152, MADV_DONTNEED) = 0
3206939 madvise(0x498c000, 8192, MADV_DONTNEED) = 0
3206939 madvise(0x2a33000, 12288, MADV_DONTNEED) = 0
3206939 madvise(0x4bb1000, 176128, MADV_DONTNEED) = 0
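
A trace like this is easier to read once it is tallied per thread. The following is only a minimal Python sketch, not part of the original report: the function names `summarize` and `blocked_only_in` are made up for illustration, and it assumes the log uses the `<tid> <syscall>(` prefix that `strace -f -o FILE` writes. It counts syscalls per thread ID and lists the threads that appear only in one syscall, such as `futex`. Note that a thread seen only in `futex(..., FUTEX_WAIT_PRIVATE, ...)` is parked waiting, which by itself does not distinguish an idle thread pool from a deadlock.

```python
import re
from collections import Counter, defaultdict

# Matches the "<tid> <syscall>(" prefix of each record in a `strace -f -o FILE` log.
# It works whether records are one per line or run together on one line.
CALL_RE = re.compile(r"\b(\d+)\s+([a-z_][a-z0-9_]*)\(")


def summarize(trace_text):
    """Return {tid: Counter({syscall: count})} for the given strace text."""
    per_tid = defaultdict(Counter)
    for tid, syscall in CALL_RE.findall(trace_text):
        per_tid[int(tid)][syscall] += 1
    return dict(per_tid)


def blocked_only_in(summary, syscall="futex"):
    """Return the sorted TIDs whose every recorded call is `syscall`."""
    return sorted(tid for tid, calls in summary.items() if set(calls) == {syscall})


if __name__ == "__main__":
    # A few records copied from the fragment above, used as sample input.
    sample = (
        "3207058 futex(0x26a9298, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>\n"
        "3207053 futex(0x26a929c, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>\n"
        "3206939 epoll_wait(155, [], 4096, 1) = 0\n"
        '3206939 recvfrom(195, "\\316", 1, 0, NULL, NULL) = 1\n'
    )
    summary = summarize(sample)
    print("threads seen only in futex:", blocked_only_in(summary))
    for tid, calls in sorted(summary.items()):
        print(tid, dict(calls))
```

In the fragment above, such a tally would show 3206939 as the only thread still cycling through `epoll_wait`, `recvfrom`, and `sendto`, while the other listed threads appear only in `futex`.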

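Because this runs in production and the problem rarely occurs, every occurrence has to be resolved as quickly as possible. That leaves me with almost nothing to go on, so for now I can only restart right away.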

bain2018, Jan 27 '24 01:01

This looks like a lock wait. Are you using the Swoole Redis coroutine client?

NathanFreeman, Feb 01 '24 00:02

There is most likely an infinite loop in your PHP code.

matyhtf, Feb 18 '24 04:02