Inbound calls not cleared after ORIGINATOR_CANCEL
For a while we've had sporadic inbound calls that stay in the "RINGING" state. Even after days they are not cleared. I don't have a reliable way to reproduce this, but it seems related to the caller canceling the request, see below.
As with #960, they show up in "show calls" like any other ringing call, but running "uuid_kill" for such a channel (by UUID) yields only "-ERR No such channel!". Other calls can be killed normally.
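A rough sweep along these lines can at least enumerate the affected channels and confirm which ones uuid_kill refuses. This is only a sketch: it assumes fs_cli is on PATH and the stock "show calls" CSV layout (UUID in field 1, callstate in field 13); the function names are made up here.

```shell
#!/bin/sh
# Sketch: find calls stuck in RINGING and try to kill each one.
# Assumes the default "show calls" CSV layout: UUID in field 1,
# callstate in field 13 -- verify against your FreeSWITCH build.

# Read "show calls" CSV on stdin, print UUIDs of RINGING calls.
ringing_uuids() {
    awk -F, 'NR > 1 && NF > 1 && $13 == "RINGING" { print $1 }'
}

# Attempt uuid_kill on every ringing call; the stuck channels
# answer with "-ERR No such channel!" as described above.
sweep_ringing() {
    fs_cli -x "show calls" | ringing_uuids | while read -r uuid; do
        fs_cli -x "uuid_kill $uuid"
    done
}
```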
Here is the SIP dialogue for one of these calls. Half a second after we send 180/Ringing, the other side cancels the request.
No. Time Source Destination Protocol Length Info
1 2021-11-18 05:08:20.766911 THEIR.IP OUR.IP SIP/SDP 1362 Request: INVITE sip:+SIPADDR |
2 2021-11-18 05:08:20.857954 OUR.IP THEIR.IP SIP 521 Status: 100 Trying |
3 2021-11-18 05:08:20.941829 OUR.IP THEIR.IP SIP 1189 Status: 180 Ringing |
4 2021-11-18 05:08:21.490194 THEIR.IP OUR.IP SIP 507 Request: CANCEL sip:+SIPADDR |
5 2021-11-18 05:08:21.579186 OUR.IP THEIR.IP SIP 466 Status: 200 OK |
6 2021-11-18 05:08:21.579192 OUR.IP THEIR.IP SIP 998 Status: 487 Request Terminated |
7 2021-11-18 05:08:21.579236 THEIR.IP OUR.IP SIP 519 Request: ACK SIPADDR |
Interestingly, our telco partner reported these calls as "no answer". Both the cancel request and the "no answer" reporting seem to happen consistently for these not-cleaned-up calls. So maybe this is a concurrency issue: while we have the call parked and are trying to figure out what to do with it (via the event socket), the request is canceled and something is not cleaned up.
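If it is such a race, one mitigation on the controller side is to re-check that the channel still exists immediately before acting on it. uuid_exists and uuid_transfer are standard FreeSWITCH API commands; everything else below (FS_CLI, uuid_alive, safe_transfer) is a made-up name in this sketch, and this narrows the race window rather than closing it.

```shell
#!/bin/sh
# Sketch: guard actions on a parked call against a CANCEL that raced
# in while we were deciding what to do with the call.
# FS_CLI is overridable for testing; defaults to the real fs_cli binary.
FS_CLI=${FS_CLI:-fs_cli}

# True if FreeSWITCH still knows the channel (uuid_exists prints true/false).
uuid_alive() {
    [ "$("$FS_CLI" -x "uuid_exists $1")" = "true" ]
}

# Only transfer the call if the channel survived the ringing phase.
safe_transfer() {
    uuid="$1"; dest="$2"
    if uuid_alive "$uuid"; then
        "$FS_CLI" -x "uuid_transfer $uuid $dest"
    else
        echo "channel $uuid already gone (canceled?)" >&2
    fi
}
```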
We were on 1.10.6, and I had hoped that the early-media fix in 1.10.7 would help us, but after upgrading we still get these odd calls.
Logs for what happens on the FS side are below. Notice the lingering socket at the end:
EXECUTE [depth=0] SIPADDR enable_heartbeat(60)
21:08:20.804113 99.10% [WARNING] switch_core_session.c:1693 SIPADDR using scheduler due to bypass media or media is not established.
--> EXECUTE [depth=0] SIPADDR socket(HOST:PORT async full)
21:08:20.884113 99.10% [DEBUG] switch_ivr.c:632 SIPADDR Command Execute [depth=0] ring_ready()
EXECUTE [depth=0] SIPADDR ring_ready()
21:08:20.884113 99.10% [NOTICE] mod_sofia.c:2515 Ring-Ready SIPADDR!
21:08:20.884113 99.10% [DEBUG] sofia.c:7499 Channel SIPADDR entering state [early][180]
21:08:20.884113 99.10% [NOTICE] mod_dptools.c:1122 Ring Ready SIPADDR!
21:08:21.504157 99.10% [DEBUG] sofia.c:7499 Channel SIPADDR entering state [terminated][487]
--> 21:08:21.504157 99.10% [NOTICE] sofia.c:8738 Hangup SIPADDR [CS_EXECUTE] [ORIGINATOR_CANCEL]
21:08:21.504157 99.10% [DEBUG] switch_core_session.c:2983 SIPADDR skip receive message [PHONE_EVENT] (channel is hungup already)
21:08:21.504157 99.10% [DEBUG] switch_core_state_machine.c:647 (SIPADDR) State EXECUTE going to sleep
21:08:21.504157 99.10% [DEBUG] switch_core_state_machine.c:581 (SIPADDR) Running State Change CS_HANGUP (Cur 10 Tot 7501)
21:08:21.504157 99.10% [DEBUG] switch_core_state_machine.c:844 (SIPADDR) Callstate Change RINGING -> HANGUP
21:08:21.504157 99.10% [DEBUG] switch_core_state_machine.c:846 (SIPADDR) State HANGUP
21:08:21.504157 99.10% [DEBUG] mod_sofia.c:468 Channel SIPADDR hanging up, cause: ORIGINATOR_CANCEL
21:08:21.504157 99.10% [DEBUG] switch_core_state_machine.c:59 SIPADDR Standard HANGUP, cause: ORIGINATOR_CANCEL
21:08:21.504157 99.10% [DEBUG] switch_core_state_machine.c:846 (SIPADDR) State HANGUP going to sleep
21:08:21.504157 99.10% [DEBUG] switch_core_state_machine.c:616 (SIPADDR) State Change CS_HANGUP -> CS_REPORTING
21:08:21.504157 99.10% [DEBUG] switch_core_state_machine.c:581 (SIPADDR) Running State Change CS_REPORTING (Cur 10 Tot 7501)
21:08:21.504157 99.10% [DEBUG] switch_core_state_machine.c:932 (SIPADDR) State REPORTING
21:08:21.504157 99.10% [DEBUG] switch_core_state_machine.c:168 SIPADDR Standard REPORTING, cause: ORIGINATOR_CANCEL
21:08:21.504157 99.10% [DEBUG] switch_core_state_machine.c:932 (SIPADDR) State REPORTING going to sleep
21:08:21.504157 99.10% [DEBUG] switch_core_state_machine.c:607 (SIPADDR) State Change CS_REPORTING -> CS_DESTROY
21:08:21.504157 99.10% [DEBUG] switch_core_session.c:1753 Session 7500 (SIPADDR) Locked, Waiting on external entities
--> 21:08:21.544103 99.10% [DEBUG] mod_event_socket.c:1470 SIPADDR Socket Linger -1
As with the referenced issues, this keeps resources occupied. The only remedy we've found is to restart FreeSWITCH.
The log entry "Socket Linger -1" shouldn't have anything to do with your issue.
Can you create a test case that can be replicated on demand for this bug please?
@mknecht have you figured out what caused this issue?
Hasn't happened recently anymore … and have never figured out how to make a reliable test case, sorry. :(
Happens to me in almost 99% of calls, regardless of which side drops the call. Here is a call that was dropped by the called party:
# grep 885d4ed6-a20e-424e-91d0-e0467d592e32 /var/log/freeswitch/freeswitch.log
...
885d4ed6-a20e-424e-91d0-e0467d592e32 2023-10-13 08:37:55.200263 94.73% [DEBUG] switch_core_state_machine.c:844 (sofia/internal/[email protected]) Callstate Change ACTIVE -> HANGUP
885d4ed6-a20e-424e-91d0-e0467d592e32 2023-10-13 08:37:55.200263 94.73% [DEBUG] switch_core_state_machine.c:846 (sofia/internal/[email protected]) State HANGUP
885d4ed6-a20e-424e-91d0-e0467d592e32 2023-10-13 08:37:55.200263 94.73% [DEBUG] mod_sofia.c:463 sofia/internal/[email protected] Overriding SIP cause 480 with 200 from the other leg
885d4ed6-a20e-424e-91d0-e0467d592e32 2023-10-13 08:37:55.200263 94.73% [DEBUG] mod_sofia.c:469 Channel sofia/internal/[email protected] hanging up, cause: NORMAL_CLEARING
885d4ed6-a20e-424e-91d0-e0467d592e32 2023-10-13 08:37:55.224733 94.73% [DEBUG] mod_sofia.c:523 Sending BYE to sofia/internal/[email protected]
885d4ed6-a20e-424e-91d0-e0467d592e32 2023-10-13 08:37:55.224733 94.73% [DEBUG] switch_core_state_machine.c:59 sofia/internal/[email protected] Standard HANGUP, cause: NORMAL_CLEARING
885d4ed6-a20e-424e-91d0-e0467d592e32 2023-10-13 08:37:55.224733 94.73% [DEBUG] switch_core_state_machine.c:846 (sofia/internal/[email protected]) State HANGUP going to sleep
885d4ed6-a20e-424e-91d0-e0467d592e32 2023-10-13 08:37:55.224733 94.73% [DEBUG] switch_core_state_machine.c:616 (sofia/internal/[email protected]) State Change CS_HANGUP -> CS_REPORTING
885d4ed6-a20e-424e-91d0-e0467d592e32 2023-10-13 08:37:55.224733 94.73% [DEBUG] switch_core_state_machine.c:581 (sofia/internal/[email protected]) Running State Change CS_REPORTING (Cur 2 Tot 6)
885d4ed6-a20e-424e-91d0-e0467d592e32 2023-10-13 08:37:55.224733 94.73% [DEBUG] switch_core_state_machine.c:932 (sofia/internal/[email protected]) State REPORTING
885d4ed6-a20e-424e-91d0-e0467d592e32 2023-10-13 08:37:55.224733 94.73% [DEBUG] switch_core_state_machine.c:168 sofia/internal/[email protected] Standard REPORTING, cause: NORMAL_CLEARING
885d4ed6-a20e-424e-91d0-e0467d592e32 2023-10-13 08:37:55.224733 94.73% [DEBUG] switch_core_state_machine.c:932 (sofia/internal/[email protected]) State REPORTING going to sleep
885d4ed6-a20e-424e-91d0-e0467d592e32 2023-10-13 08:37:55.224733 94.73% [DEBUG] switch_core_state_machine.c:607 (sofia/internal/[email protected]) State Change CS_REPORTING -> CS_DESTROY
885d4ed6-a20e-424e-91d0-e0467d592e32 2023-10-13 08:37:55.224733 94.73% [DEBUG] switch_core_session.c:1744 Session 5 (sofia/internal/[email protected]) Locked, Waiting on external entities
885d4ed6-a20e-424e-91d0-e0467d592e32 2023-10-13 08:37:55.224733 94.73% [DEBUG] mod_event_socket.c:1470 sofia/internal/[email protected] Socket Linger -1
# /usr/bin/fs_cli -x "show calls" | grep 885d4ed6-a20e-424e-91d0-e0467d592e32
885d4ed6-a20e-424e-91d0-e0467d592e32,inbound,2023-10-13 08:37:32,1697186252,sofia/internal/[email protected],CS_SOFT_EXECUTE,703,703,xx.xx.174.66,pmq_mydomain,[email protected],,ACTIVE,mydomain,+xxxxxxx7914,SEND,885d4ed6-a20e-424e-91d0-e0467d592e32,ha-sp-2-3,mydomain,+xxxxxxx7914,,,,,,,,,,,,,,,,,,,
When the originator drops the call, the only difference is that the last line with "Socket Linger" is missing:
# grep 4561a39d-a479-4433-b18b-d6ed636ac1b2 /var/log/freeswitch/freeswitch.log
...
4561a39d-a479-4433-b18b-d6ed636ac1b2 2023-10-13 09:15:32.160361 96.60% [DEBUG] switch_core_state_machine.c:932 (sofia/internal/[email protected]) State REPORTING going to sleep
4561a39d-a479-4433-b18b-d6ed636ac1b2 2023-10-13 09:15:32.160361 96.60% [DEBUG] switch_core_state_machine.c:607 (sofia/internal/[email protected]) State Change CS_REPORTING -> CS_DESTROY
4561a39d-a479-4433-b18b-d6ed636ac1b2 2023-10-13 09:15:32.160361 96.60% [DEBUG] switch_core_session.c:1744 Session 7 (sofia/internal/[email protected]) Locked, Waiting on external entities
# /usr/bin/fs_cli -x "show calls" | grep 4561a39d-a479-4433-b18b-d6ed636ac1b2
4561a39d-a479-4433-b18b-d6ed636ac1b2,inbound,2023-10-13 09:15:10,1697188510,sofia/internal/[email protected],CS_SOFT_EXECUTE,703,703,xx.xx.174.66,pmq_mydomain,[email protected],,ACTIVE,mydomain,+xxxxxxx7914,SEND,4561a39d-a479-4433-b18b-d6ed636ac1b2,ha-sp-2-3,mydomain,+xxxxxxx7914,,,,,,,,,,,,,,,,,,,
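To triage these leftovers, one can list call UUIDs that no longer have a matching channel, i.e. exactly the entries uuid_kill will reject. A sketch, assuming both "show calls" and "show channels" emit CSV with the UUID in the first column (as in the output above); stale_call_uuids is a made-up name:

```shell
#!/bin/sh
# Sketch: print UUIDs that "show calls" still lists but "show channels"
# no longer knows about -- the leftovers that uuid_kill rejects with
# "-ERR No such channel!".

# $1: file with "show calls" CSV, $2: file with "show channels" CSV.
stale_call_uuids() {
    awk -F, '
        NR == FNR { if (FNR > 1 && NF > 1) chan[$1] = 1; next }
        FNR > 1 && NF > 1 && !($1 in chan) { print $1 }
    ' "$2" "$1"
}
```

Usage against a live box would be along the lines of saving `fs_cli -x "show calls"` and `fs_cli -x "show channels"` to two files and passing them to the function.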
Any ideas what might cause this?
# rpm -q freeswitch
freeswitch-1.10.10.release.17-1.el7.x86_64