FreeSWITCH crashes: Program terminated with signal SIGFPE, Arithmetic exception
Hello,
FreeSWITCH often crashes every one or two days, and the problem has lasted for about 2 months. The OS is CentOS 7.4 and the FS version is 1.10.7. sofia-sip is the latest version, 1.13.7, and was rebuilt last week. The processor is: 12 Intel(R) Xeon(R) Bronze 3204 CPU @ 1.90GHz.
The call flow is a simple inbound call to an IVR, which then goes to callcenter according to the digit dialed. No call flows involve localhost in the setup, and no local resolver is used.
The file /etc/resolv.conf is empty.
The file /etc/hosts contains:
    127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
    ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
The error messages in /var/log/messages are:
    May 4 20:34:52 localhost kernel: traps: freeswitch[92647] trap divide error ip:7f2c9e58381a sp:7f2c7d58c070 error:0 in libsofia-sip-ua.so.0.6.0[7f2c9e457000+1ca000]
    May 4 20:34:54 localhost freeswitch: #33[m#033[32m2022-05-04 20:34:34.034071 98.97% [INFO] switch_cpp.cpp:1465 From Lua callcnt_say_agent_num.lua: caller_uuid: f
    May 4 20:34:54 localhost systemd: freeswitch.service: main process exited, code=killed, status=8/FPE
    May 4 20:34:54 localhost systemd: Unit freeswitch.service entered failed state.
    May 4 20:34:54 localhost systemd: freeswitch.service failed.
The gdb backtrace is:
    Core was generated by `/usr/local/fs1.10.7/bin/freeswitch -nonat'.
    Program terminated with signal SIGFPE, Arithmetic exception.
    #0 0x00007f2c9e58381a in su_block_add (b=0x7f2c542282f0, p=0x7f2c542547f0) at su_alloc.c:342
    342 h = (size_t)((uintptr_t)p % b->sub_n);
    [Current thread is 1 (LWP 92647)]
    (gdb) bt
    #0 0x00007f2c9e58381a in su_block_add (b=0x7f2c542282f0, p=0x7f2c542547f0) at su_alloc.c:342
    #1 0x00007f2c9e583dcf in sub_alloc (home=0x7f2c54250ef0, sub=0x7f2c542282f0, size=80, zero=do_calloc) at su_alloc.c:530
    #2 0x00007f2c9e5858e3 in su_zalloc (home=0x7f2c54250ef0, size=80) at su_alloc.c:1577
    #3 0x00007f2c9e50dd4b in outgoing_make_a_aaaa_query (orq=0x7f2c54254690) at nta.c:10463
    #4 0x00007f2c9e50d350 in outgoing_resolve_next (orq=0x7f2c54254690) at nta.c:10244
    #5 0x00007f2c9e50ee88 in outgoing_answer_srv (orq=0x7f2c54254690, q=0x7f2c5410b2c0, answers=0x7f2c541167b0) at nta.c:10830
    #6 0x00007f2c9e5782c8 in sres_query_report_error (q=0x7f2c5410b2c0, answers=0x7f2c541167b0) at sres.c:2991
    #7 0x00007f2c9e578675 in sres_resend_dns_query (res=0x7f2c54002990, q=0x7f2c5410b2c0, timeout=0) at sres.c:3091
    #8 0x00007f2c9e5795c2 in sres_resolver_report_error (res=0x7f2c54002990, socket=590, errcode=111, remote=0x7f2c7d58c560, remotelen=16, info=0x7f2c7d58c4e0 "icmp type=3 code=3 reported by 127.0.0.1") at sres.c:3440
    #9 0x00007f2c9e579281 in sres_resolver_error (res=0x7f2c54002990, socket=590) at sres.c:3354
    #10 0x00007f2c9e57ef20 in sres_sofia_poll (magic=0x7f2c68001180, w=0x7f2c54001084, reg=0x7f2c540037c0) at sresolv.c:357
    #11 0x00007f2c9e59079a in su_epoll_port_wait_events (self=0x7f2c540008c0, tout=690) at su_epoll_port.c:510
    #12 0x00007f2c9e58cb9a in su_base_port_run (self=0x7f2c540008c0) at su_base_port.c:349
    #13 0x00007f2c9e588e4a in su_port_run (self=0x7f2c540008c0) at su_port.h:326
    #14 0x00007f2c9e589f24 in su_root_run (self=0x7f2c54001130) at su_root.c:819
    #15 0x00007f2c9e58d956 in su_pthread_port_clone_main (varg=0x7f2c91233460) at su_pthread_port.c:343
    #16 0x00007f2c9d5b9e65 in start_thread () from /lib64/libpthread.so.0
    #17 0x00007f2c9cc0e88d in __libc_ifunc_impl_list () from /lib64/libc.so.6
    #18 0x0000000000000000 in ?? ()
    (gdb)
Would you please tell me how to capture the messages, in particular any messages that involve localhost in the setup, and how to check whether a local resolver is set?
Please help, thank you!
backtrace1107.log
Can you try this PR? https://github.com/freeswitch/sofia-sip/pull/130
You must rebuild and reinstall sofia-sip, then rebuild FS. The bug looks related to the code resolving IPv6 / AAAA.
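A note on frame #0 of the backtrace: it stops at `h = (size_t)((uintptr_t)p % b->sub_n);`. For an unsigned modulo like this one, x86-64 raises a divide error (reported as SIGFPE) when the divisor is zero, so `b->sub_n` was presumably 0 at that point, which would fit a block that had already been freed or overwritten. That is only a reading of the trace, not something confirmed in the sofia-sip source. The sketch below illustrates the failure mode with a guard; `struct demo_block` and `demo_slot_index` are hypothetical names for illustration and are not sofia-sip code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the allocator's block header; only the
 * field involved in the crashing expression is modelled here. */
struct demo_block {
    size_t sub_n;   /* number of hash slots; zero means the block is unusable */
};

/* Compute the hash slot for pointer p, as the crashing line does, but
 * refuse to divide when the block is missing or has zero slots.
 * Returns 0 and stores the slot in *slot, or -1 without touching *slot. */
int demo_slot_index(const struct demo_block *b, const void *p, size_t *slot)
{
    assert(slot != NULL);

    if (b == NULL || b->sub_n == 0)
        return -1;              /* would have been a divide error */

    *slot = (size_t)((uintptr_t)p % b->sub_n);
    return 0;
}
```

A guard like this would only hide the symptom. If `sub_n` really is zero in a live block, the underlying problem is more likely memory being reused after it was released.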
Thank you, I will try it.
Hello,
I downloaded and rebuilt sofia-sip and FreeSWITCH from freeswitch/sofia-sip#130, but FS still crashed after running for 18 hours. The error has changed from SIGFPE to SIGABRT.
    Core was generated by `/usr/local/fs1.10.7/bin/freeswitch -nonat'.
    Program terminated with signal SIGABRT, Aborted.
    #0 0x00007f9a3713a337 in ssignal () from /lib64/libc.so.6
    [Current thread is 1 (LWP 152931)]
    (gdb) bt
    #0 0x00007f9a3713a337 in ssignal () from /lib64/libc.so.6
    #1 0x00007f9a3713ba28 in abort () from /lib64/libc.so.6
    #2 0x00007f9a3717ce87 in __libc_message () from /lib64/libc.so.6
    #3 0x00007f9a37185679 in _int_free () from /lib64/libc.so.6
    #4 0x00007f9a38b77c06 in sub_alloc (home=0x7f99ec42fa40, sub=0x7f99ec34e6e0, size=80, zero=do_calloc) at su_alloc.c:478
    #5 0x00007f9a38b798e3 in su_zalloc (home=0x7f99ec42fa40, size=80) at su_alloc.c:1577
    #6 0x00007f9a38b01d4b in outgoing_make_a_aaaa_query (orq=0x7f99ec430dc0) at nta.c:10463
    #7 0x00007f9a38b01350 in outgoing_resolve_next (orq=0x7f99ec430dc0) at nta.c:10244
    #8 0x00007f9a38b02e88 in outgoing_answer_srv (orq=0x7f99ec430dc0, q=0x7f99ec430fe0, answers=0x7f99ec356780) at nta.c:10830
    #9 0x00007f9a38b6c2c8 in sres_query_report_error (q=0x7f99ec430fe0, answers=0x7f99ec356780) at sres.c:2991
    #10 0x00007f9a38b6c675 in sres_resend_dns_query (res=0x7f99ec002990, q=0x7f99ec430fe0, timeout=0) at sres.c:3091
    #11 0x00007f9a38b6d5c2 in sres_resolver_report_error (res=0x7f99ec002990, socket=229, errcode=111, remote=0x7f9a22062560, remotelen=16, info=0x7f9a220624e0 "icmp type=3 code=3 reported by 127.0.0.1") at sres.c:3440
    #12 0x00007f9a38b6d281 in sres_resolver_error (res=0x7f99ec002990, socket=229) at sres.c:3354
    #13 0x00007f9a38b72f20 in sres_sofia_poll (magic=0x7f9a00001180, w=0x7f99ec20d494, reg=0x7f99ec0037c0) at sresolv.c:357
    #14 0x00007f9a38b8479a in su_epoll_port_wait_events (self=0x7f99ec0008c0, tout=695) at su_epoll_port.c:510
    #15 0x00007f9a38b80b9a in su_base_port_run (self=0x7f99ec0008c0) at su_base_port.c:349
    #16 0x00007f9a38b7ce4a in su_port_run (self=0x7f99ec0008c0) at su_port.h:326
    #17 0x00007f9a38b7df24 in su_root_run (self=0x7f99ec001130) at su_root.c:819
    #18 0x00007f9a38b81956 in su_pthread_port_clone_main (varg=0x7f9a237d6450) at su_pthread_port.c:343
    #19 0x00007f9a37bade65 in start_thread () from /lib64/libpthread.so.0
    #20 0x00007f9a3720288d in __libc_ifunc_impl_list () from /lib64/libc.so.6
    #21 0x0000000000000000 in ?? ()
    (gdb)
Please help, thank you!
backtrace20220521.log
I pushed another commit and rebased the branch, so you'll get more fixes. Please test.
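On the earlier question of how to check whether a local resolver is set: both backtraces show errcode=111 (ECONNREFUSED on Linux) together with "icmp type=3 code=3 reported by 127.0.0.1", which is a port-unreachable reply from localhost. That suggests the DNS query went to 127.0.0.1 with nothing listening there. An empty /etc/resolv.conf would be consistent with this if the resolver falls back to localhost when no nameserver is configured, but that fallback is an assumption here, not something verified in the source. A minimal sketch for checking how many nameserver entries a resolv.conf-style file has (`count_nameservers` is a hypothetical helper, not part of sofia-sip or FreeSWITCH):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Count "nameserver <address>" entries in a resolv.conf-style file.
 * Returns the number of entries, or -1 if the file cannot be opened.
 * Lines longer than the buffer are not handled specially in this sketch. */
int count_nameservers(const char *path)
{
    char line[512];
    int count = 0;
    FILE *f;

    assert(path != NULL);

    f = fopen(path, "r");
    if (f == NULL)
        return -1;

    while (fgets(line, sizeof line, f) != NULL) {
        const char *s = line;

        /* skip leading blanks */
        while (*s == ' ' || *s == '\t')
            s++;

        /* the keyword must be followed by whitespace ... */
        if (strncmp(s, "nameserver", 10) != 0)
            continue;
        s += 10;
        if (*s != ' ' && *s != '\t')
            continue;
        while (*s == ' ' || *s == '\t')
            s++;

        /* ... and then by something that can be an address */
        if (*s != '\0' && *s != '\n' && *s != '#')
            count++;
    }

    fclose(f);
    return count;
}
```

Calling `count_nameservers("/etc/resolv.conf")` on the machine described above should return 0 for an empty file. In that case, adding a reachable nameserver entry may be worth trying as a workaround while the fix is being tested.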