EasyLogger
Deadlock after running with multiple threads for a while
Environment: ARM Linux
(gdb) bt
#0 0x0000007f90ea9064 in __lll_lock_wait () from target:/lib64/libpthread.so.0
#1 0x0000007f90ea1a4c in pthread_mutex_lock ()
from target:/lib64/libpthread.so.0
#2 0x00000000004cad80 in elog_port_output_lock ()
at ./3rd_lib/easylogger/port/elog_port.c:76
#3 0x00000000004cbf78 in elog_get_filter_tag_lvl (tag=0x651c40 "TASK")
at ./3rd_lib/easylogger/src/elog.c:435
#4 0x00000000004cc2e8 in elog_output (level=2 '\002', tag=0x651c40 "TASK",
file=0x651968 "./devices/devices.c",
func=0x651f68 <__FUNCTION__.12424> "timer1_handle", line=217,
format=0x651c18 "timming update_devices_status %d")
at ./3rd_lib/easylogger/src/elog.c:528
#5 0x00000000004e9314 in timer1_handle (sig=10) at ./devices/devices.c:217
#6 <signal handler called>
#7 0x0000007f90d0f6d8 in nanosleep () from target:/lib64/libc.so.6
#8 0x0000007f90d0f568 in sleep () from target:/lib64/libc.so.6
#9 0x00000000004e75ec in main (argc=1, argv=0x7fea78d518) at ./main.c:98
(gdb) thread 2
[Switching to thread 2 (Thread 3106.3107)]
#0 0x0000007f90ea9034 in __lll_lock_wait () from target:/lib64/libpthread.so.0
(gdb) bt
#0 0x0000007f90ea9034 in __lll_lock_wait () from target:/lib64/libpthread.so.0
#1 0x0000007f90ea1a4c in pthread_mutex_lock ()
from target:/lib64/libpthread.so.0
#2 0x00000000004cad80 in elog_port_output_lock ()
at ./3rd_lib/easylogger/port/elog_port.c:76
#3 0x00000000004cbf78 in elog_get_filter_tag_lvl (tag=0x651c40 "TASK")
at ./3rd_lib/easylogger/src/elog.c:435
#4 0x00000000004cc2e8 in elog_output (level=2 '\002', tag=0x651c40 "TASK",
file=0x651968 "./devices/devices.c",
func=0x651f68 <__FUNCTION__.12424> "timer1_handle", line=226,
format=0x651c48 "timming check_devices_status %d")
at ./3rd_lib/easylogger/src/elog.c:528
#5 0x00000000004e93b4 in timer1_handle (sig=10) at ./devices/devices.c:226
#6 <signal handler called>
#7 0x0000007f90d2e6e8 in write () from target:/lib64/libc.so.6
#8 0x0000007f90cddeb0 in _IO_file_write () from target:/lib64/libc.so.6
#9 0x0000007f90cdd2f8 in ?? () from target:/lib64/libc.so.6
#10 0x0000007f90cde648 in _IO_file_xsputn () from target:/lib64/libc.so.6
#11 0x0000007f90cb8e90 in ?? () from target:/lib64/libc.so.6
#12 0x0000007f90cb6824 in vfprintf () from target:/lib64/libc.so.6
#13 0x0000007f90cbd948 in printf () from target:/lib64/libc.so.6
#14 0x00000000004cad54 in elog_port_output (
log=0x7065b8 <poll_get_buf> "D/HEX JP_PLC: 0000-0017: 55 ...***... 0A"..., size=127)
at ./3rd_lib/easylogger/port/elog_port.c:65
#15 0x00000000004cde18 in async_output (arg=0x0)
at ./3rd_lib/easylogger/src/elog_async.c:299
#16 0x0000007f90e9f0e8 in start_thread () from target:/lib64/libpthread.so.0
#17 0x0000007f90d3af4c in ?? () from target:/lib64/libc.so.6
(gdb) thread 3
[Switching to thread 3 (Thread 3106.3108)]
#0 0x0000007f90ea9064 in __lll_lock_wait () from target:/lib64/libpthread.so.0
(gdb) bt
#0 0x0000007f90ea9064 in __lll_lock_wait () from target:/lib64/libpthread.so.0
#1 0x0000007f90ea1a4c in pthread_mutex_lock () from target:/lib64/libpthread.so.0
#2 0x00000000004cad80 in elog_port_output_lock ()
at ./3rd_lib/easylogger/port/elog_port.c:76
#3 0x00000000004cbf78 in elog_get_filter_tag_lvl (tag=0x651c40 "TASK")
at ./3rd_lib/easylogger/src/elog.c:435
#4 0x00000000004cc2e8 in elog_output (level=2 '\002', tag=0x651c40 "TASK",
file=0x651968 "./devices/devices.c",
func=0x651f68 <__FUNCTION__.12424> "timer1_handle", line=226,
format=0x651c48 "timming check_devices_status %d")
at ./3rd_lib/easylogger/src/elog.c:528
#5 0x00000000004e93b4 in timer1_handle (sig=10) at ./devices/devices.c:226
#6 <signal handler called>
#7 0x0000007f90d0f6d8 in nanosleep () from target:/lib64/libc.so.6
#8 0x0000007f90d34d08 in usleep () from target:/lib64/libc.so.6
#9 0x00000000004e6bf4 in http_deal (arg=0x0) at ./service/http.c:415
#10 0x0000007f90e9f0e8 in start_thread () from target:/lib64/libpthread.so.0
#11 0x0000007f90d3af4c in ?? () from target:/lib64/libc.so.6
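Can you describe the exact symptoms? Have you tested with the Linux demo that ships with the project?
I ported the Linux demo to an ARM Linux board and did not modify anything in the EasyLogger part. Under gdb there are 11 threads in total, plus a timer callback. The deadlock only shows up after several hours under a certain level of stress testing.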
https://github.com/armink/EasyLogger/blob/master/demo/os/linux/easylogger/port/elog_port.c#L77
Try adding some tracing here to record which thread last acquired the lock successfully, and what state that thread is in.
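A minimal sketch of that kind of tracing, assuming the port layer guards output with a single pthread mutex. The names `debug_output_lock`, `debug_output_unlock`, `last_owner`, `last_owner_valid`, and `lock_count` are hypothetical and not part of EasyLogger; the wrappers only stand in for the port's lock and unlock functions.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical debug state, not part of EasyLogger. */
static pthread_mutex_t output_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_t last_owner;          /* thread that last acquired the lock */
static int last_owner_valid = 0;      /* 1 while the lock is held */
static unsigned long lock_count = 0;  /* number of successful acquisitions */

/* Stand-in for the port's output lock, with owner tracking added. */
void debug_output_lock(void) {
    int rc = pthread_mutex_lock(&output_lock);
    assert(rc == 0);
    /* Written only while holding the lock. */
    last_owner = pthread_self();
    last_owner_valid = 1;
    lock_count++;
}

/* Stand-in for the port's output unlock. */
void debug_output_unlock(void) {
    last_owner_valid = 0;
    int rc = pthread_mutex_unlock(&output_lock);
    assert(rc == 0);
}
```

Once the process hangs, printing `last_owner` and `last_owner_valid` in gdb and comparing against `info threads` should show which thread still holds the lock, and whether that thread still exists.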
I ran into the same problem.
If the thread that last made the call was cancelled right after calling, would that be a problem?
How is the thread cancelled? Does it exit normally, or is it forced?
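I use pthread_cancel. Would that matter? Could the thread have exited while inside the lock, so that unlock was never called afterwards?
That is possible. Having one thread cancel another directly is quite unsafe. I'd suggest using a notification instead: notify the target thread and let it return and exit on its own.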
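A rough sketch of the notify-and-return approach, assuming a C11 toolchain with `<stdatomic.h>`. The names `stop_requested`, `iterations`, `worker`, and `run_and_stop_worker` are hypothetical; a real worker would do its work (including logging) where the counter is incremented.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical shutdown flag shared between the controller and the worker. */
static atomic_bool stop_requested = false;
static atomic_ulong iterations = 0;

static void *worker(void *arg) {
    (void)arg;
    /* The flag is checked only between units of work, so any lock taken
       during the work has already been released when the thread returns. */
    while (!atomic_load(&stop_requested)) {
        atomic_fetch_add(&iterations, 1);  /* stand-in for real work */
        sched_yield();
    }
    return NULL;
}

/* Starts the worker, waits until it has run at least once, then asks it
   to stop and joins it. Returns 0 on success. */
int run_and_stop_worker(void) {
    pthread_t tid;
    if (pthread_create(&tid, NULL, worker, NULL) != 0) {
        return -1;
    }
    while (atomic_load(&iterations) == 0) {
        sched_yield();
    }
    atomic_store(&stop_requested, true);  /* notify instead of pthread_cancel */
    return pthread_join(tid, NULL);
}
```

Unlike `pthread_cancel`, the worker here can only exit at a point of its own choosing, so it should never terminate between a lock and the matching unlock.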
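I'll give it a try, thanks.
I'm also on Linux and hit a deadlock. It turned out to be a mistake in my own port. There is an easylogger folder under demo and another one in the repository root, and the two are easy to mix up. After I overwrote the root easylogger with the directory from the demo, the problem no longer appeared.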
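Check whether your code uses signals. If a lock is still held when a signal interrupts the thread, it will deadlock. Also, don't casually cancel threads in a multithreaded program; it is best to let a thread keep running once it has started.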
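Has your problem been solved? I redid the port and haven't seen any issues so far.
Using a timer together with threads can lead to a deadlock. Reference: https://clodfisher.github.io/2018/10/AlarmAndPthread/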
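Yes. EasyLogger takes a lock when it outputs. If the timer handler preempts a thread that already holds that lock and then logs from the same thread, it waits on a lock that can never be released, and the program deadlocks.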
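One common way to avoid this is to keep the signal handler free of logging and locks: it only sets a flag, and the log call happens later from normal thread context. The sketch below assumes the timer is delivered as SIGUSR1 (the backtraces show sig=10, which is SIGUSR1 on most Linux architectures). `timer_fired`, `install_timer_handler`, and `process_timer_tick` are hypothetical names, and `printf` stands in for the real log call.

```c
#include <signal.h>
#include <stdio.h>

/* Set by the handler, consumed by the main loop. */
static volatile sig_atomic_t timer_fired = 0;

/* Handler does no logging and takes no locks; it only records the event. */
static void timer1_handle(int sig) {
    (void)sig;
    timer_fired = 1;
}

/* Installs the handler. Returns 0 on success, -1 on failure. */
int install_timer_handler(void) {
    return signal(SIGUSR1, timer1_handle) == SIG_ERR ? -1 : 0;
}

/* Called from normal thread context (for example the main loop), where
   taking the logger's lock is safe. Returns 1 if a tick was handled.
   Ticks that arrive before the flag is cleared are coalesced into one. */
int process_timer_tick(void) {
    if (!timer_fired) {
        return 0;
    }
    timer_fired = 0;
    printf("timer tick: update_devices_status\n");  /* stand-in for the log call */
    return 1;
}
```

The main loop would call `process_timer_tick()` on each iteration in place of a bare `sleep()`. A dedicated thread that blocks the signal and waits for it with `sigwait` is another option with the same effect: the logger is never entered from a signal handler.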