freeswitch icon indicating copy to clipboard operation
freeswitch copied to clipboard

Inbound calls not cleared after ORIGINATOR_CANCEL

Open mknecht opened this issue 3 years ago • 3 comments

For a while we've had sporadic inbound calls that stay in the “RINGING“ state. Even after days they are not cleared. I don't have a reliable way to reproduce this, but it seems related to canceling the request, see below.

As with #960, they show up for show calls like any other ringing call but running uuid_kill for such a channel (ID) yields only -ERR No such channel!. Other calls can be killed.

Here is the SIP dialogue for one of these calls. Half a second after we send 180/Ringing, the other side cancels the request.

No.	Time				Source		Destination	Protocol	Length	Info
1	2021-11-18 05:08:20.766911	THEIR.IP	OUR.IP		SIP/SDP	1362	Request: INVITE sip:+SIPADDR | 
2	2021-11-18 05:08:20.857954	OUR.IP		THEIR.IP	SIP	521	Status: 100 Trying | 
3	2021-11-18 05:08:20.941829	OUR.IP		THEIR.IP	SIP	1189	Status: 180 Ringing | 
4	2021-11-18 05:08:21.490194	THEIR.IP	OUR.IP		SIP	507	Request: CANCEL sip:+SIPADDR | 
5	2021-11-18 05:08:21.579186	OUR.IP		THEIR.IP	SIP	466	Status: 200 OK | 
6	2021-11-18 05:08:21.579192	OUR.IP		THEIR.IP	SIP	998	Status: 487 Request Terminated | 
7	2021-11-18 05:08:21.579236	THEIR.IP	OUR.IP		SIP	519	Request: ACK SIPADDR | 

Interestingly, our telco partner reported these calls as “no answer“. Both the cancel request and the “no answer“ reporting seem to consistently happen for those not-cleaned up calls. So maybe this is a concurrency issue: while we've got the call parked and try to figure out what to do with it (via event socket) the request is canceled and something is not cleaned up.

We were on 1.10.6 and I had hoped that the early media fix for 1.10.7 would help us, but after upgrading we still get these odd calls.

Logs for what happens on the FS side are below. Notice the lingering socket at the end

      EXECUTE [depth=0] SIPADDR enable_heartbeat(60)
      21:08:20.804113 99.10% [WARNING] switch_core_session.c:1693 SIPADDR using scheduler due to bypass media or media is not established.
-->   EXECUTE [depth=0] SIPADDR socket(HOST:PORT async full)
      21:08:20.884113 99.10% [DEBUG] switch_ivr.c:632 SIPADDR Command Execute [depth=0] ring_ready()
      EXECUTE [depth=0] SIPADDR ring_ready()
      21:08:20.884113 99.10% [NOTICE] mod_sofia.c:2515 Ring-Ready SIPADDR!
      21:08:20.884113 99.10% [DEBUG] sofia.c:7499 Channel SIPADDR entering state [early][180]
      21:08:20.884113 99.10% [NOTICE] mod_dptools.c:1122 Ring Ready SIPADDR!
      21:08:21.504157 99.10% [DEBUG] sofia.c:7499 Channel SIPADDR entering state [terminated][487]
-->   21:08:21.504157 99.10% [NOTICE] sofia.c:8738 Hangup SIPADDR [CS_EXECUTE] [ORIGINATOR_CANCEL]
      21:08:21.504157 99.10% [DEBUG] switch_core_session.c:2983 SIPADDR skip receive message [PHONE_EVENT] (channel is hungup already)
      21:08:21.504157 99.10% [DEBUG] switch_core_state_machine.c:647 (SIPADDR) State EXECUTE going to sleep
      21:08:21.504157 99.10% [DEBUG] switch_core_state_machine.c:581 (SIPADDR) Running State Change CS_HANGUP (Cur 10 Tot 7501)
      21:08:21.504157 99.10% [DEBUG] switch_core_state_machine.c:844 (SIPADDR) Callstate Change RINGING -> HANGUP
      21:08:21.504157 99.10% [DEBUG] switch_core_state_machine.c:846 (SIPADDR) State HANGUP
      21:08:21.504157 99.10% [DEBUG] mod_sofia.c:468 Channel SIPADDR hanging up, cause: ORIGINATOR_CANCEL
      21:08:21.504157 99.10% [DEBUG] switch_core_state_machine.c:59 SIPADDR Standard HANGUP, cause: ORIGINATOR_CANCEL
      21:08:21.504157 99.10% [DEBUG] switch_core_state_machine.c:846 (SIPADDR) State HANGUP going to sleep
      21:08:21.504157 99.10% [DEBUG] switch_core_state_machine.c:616 (SIPADDR) State Change CS_HANGUP -> CS_REPORTING
      21:08:21.504157 99.10% [DEBUG] switch_core_state_machine.c:581 (SIPADDR) Running State Change CS_REPORTING (Cur 10 Tot 7501)
      21:08:21.504157 99.10% [DEBUG] switch_core_state_machine.c:932 (SIPADDR) State REPORTING
      21:08:21.504157 99.10% [DEBUG] switch_core_state_machine.c:168 SIPADDR Standard REPORTING, cause: ORIGINATOR_CANCEL
      21:08:21.504157 99.10% [DEBUG] switch_core_state_machine.c:932 (SIPADDR) State REPORTING going to sleep
      21:08:21.504157 99.10% [DEBUG] switch_core_state_machine.c:607 (SIPADDR) State Change CS_REPORTING -> CS_DESTROY
      21:08:21.504157 99.10% [DEBUG] switch_core_session.c:1753 Session 7500 (SIPADDR) Locked, Waiting on external entities
-->   21:08:21.544103 99.10% [DEBUG] mod_event_socket.c:1470 SIPADDR Socket Linger -1

As with the referenced issues, this keeps resources occupied. The only remedy that we've found is to restart Freeswitch.

mknecht avatar Nov 18 '21 06:11 mknecht

The log entry "Socket Linger -1" shouldn't have anything to do with your issue.

dragos-oancea avatar Dec 29 '21 10:12 dragos-oancea

Can you create a test case that can be replicated on demand for this bug please?

Len-PGH avatar May 04 '22 14:05 Len-PGH

@mknecht have you figured out what caused this issue?

mirkobrankovic avatar Aug 22 '22 10:08 mirkobrankovic

Hasn't happened recently anymore … and have never figured out how to make a reliable test case, sorry. :(

mknecht avatar Feb 06 '23 14:02 mknecht

Happens to me in almost 99% of calls despite of which side drops the call. Here is the call that was dropped by the called party:

# grep 885d4ed6-a20e-424e-91d0-e0467d592e32 /var/log/freeswitch/freeswitch.log
...
885d4ed6-a20e-424e-91d0-e0467d592e32 2023-10-13 08:37:55.200263 94.73% [DEBUG] switch_core_state_machine.c:844 (sofia/internal/[email protected]) Callstate Change ACTIVE -> HANGUP
885d4ed6-a20e-424e-91d0-e0467d592e32 2023-10-13 08:37:55.200263 94.73% [DEBUG] switch_core_state_machine.c:846 (sofia/internal/[email protected]) State HANGUP
885d4ed6-a20e-424e-91d0-e0467d592e32 2023-10-13 08:37:55.200263 94.73% [DEBUG] mod_sofia.c:463 sofia/internal/[email protected] Overriding SIP cause 480 with 200 from the other leg
885d4ed6-a20e-424e-91d0-e0467d592e32 2023-10-13 08:37:55.200263 94.73% [DEBUG] mod_sofia.c:469 Channel sofia/internal/[email protected] hanging up, cause: NORMAL_CLEARING
885d4ed6-a20e-424e-91d0-e0467d592e32 2023-10-13 08:37:55.224733 94.73% [DEBUG] mod_sofia.c:523 Sending BYE to sofia/internal/[email protected]
885d4ed6-a20e-424e-91d0-e0467d592e32 2023-10-13 08:37:55.224733 94.73% [DEBUG] switch_core_state_machine.c:59 sofia/internal/[email protected] Standard HANGUP, cause: NORMAL_CLEARING
885d4ed6-a20e-424e-91d0-e0467d592e32 2023-10-13 08:37:55.224733 94.73% [DEBUG] switch_core_state_machine.c:846 (sofia/internal/[email protected]) State HANGUP going to sleep
885d4ed6-a20e-424e-91d0-e0467d592e32 2023-10-13 08:37:55.224733 94.73% [DEBUG] switch_core_state_machine.c:616 (sofia/internal/[email protected]) State Change CS_HANGUP -> CS_REPORTING
885d4ed6-a20e-424e-91d0-e0467d592e32 2023-10-13 08:37:55.224733 94.73% [DEBUG] switch_core_state_machine.c:581 (sofia/internal/[email protected]) Running State Change CS_REPORTING (Cur 2 Tot 6)
885d4ed6-a20e-424e-91d0-e0467d592e32 2023-10-13 08:37:55.224733 94.73% [DEBUG] switch_core_state_machine.c:932 (sofia/internal/[email protected]) State REPORTING
885d4ed6-a20e-424e-91d0-e0467d592e32 2023-10-13 08:37:55.224733 94.73% [DEBUG] switch_core_state_machine.c:168 sofia/internal/[email protected] Standard REPORTING, cause: NORMAL_CLEARING
885d4ed6-a20e-424e-91d0-e0467d592e32 2023-10-13 08:37:55.224733 94.73% [DEBUG] switch_core_state_machine.c:932 (sofia/internal/[email protected]) State REPORTING going to sleep
885d4ed6-a20e-424e-91d0-e0467d592e32 2023-10-13 08:37:55.224733 94.73% [DEBUG] switch_core_state_machine.c:607 (sofia/internal/[email protected]) State Change CS_REPORTING -> CS_DESTROY
885d4ed6-a20e-424e-91d0-e0467d592e32 2023-10-13 08:37:55.224733 94.73% [DEBUG] switch_core_session.c:1744 Session 5 (sofia/internal/[email protected]) Locked, Waiting on external entities
885d4ed6-a20e-424e-91d0-e0467d592e32 2023-10-13 08:37:55.224733 94.73% [DEBUG] mod_event_socket.c:1470 sofia/internal/[email protected] Socket Linger -1

# /usr/bin/fs_cli -x "show calls" | grep 885d4ed6-a20e-424e-91d0-e0467d592e32
885d4ed6-a20e-424e-91d0-e0467d592e32,inbound,2023-10-13 08:37:32,1697186252,sofia/internal/[email protected],CS_SOFT_EXECUTE,703,703,xx.xx.174.66,pmq_mydomain,[email protected],,ACTIVE,mydomain,+xxxxxxx7914,SEND,885d4ed6-a20e-424e-91d0-e0467d592e32,ha-sp-2-3,mydomain,+xxxxxxx7914,,,,,,,,,,,,,,,,,,,

When the originator drops the call, the only difference is the missing last line with "Socket Linger":

# grep 4561a39d-a479-4433-b18b-d6ed636ac1b2 /var/log/freeswitch/freeswitch.log
...
4561a39d-a479-4433-b18b-d6ed636ac1b2 2023-10-13 09:15:32.160361 96.60% [DEBUG] switch_core_state_machine.c:932 (sofia/internal/[email protected]) State REPORTING going to sleep
4561a39d-a479-4433-b18b-d6ed636ac1b2 2023-10-13 09:15:32.160361 96.60% [DEBUG] switch_core_state_machine.c:607 (sofia/internal/[email protected]) State Change CS_REPORTING -> CS_DESTROY
4561a39d-a479-4433-b18b-d6ed636ac1b2 2023-10-13 09:15:32.160361 96.60% [DEBUG] switch_core_session.c:1744 Session 7 (sofia/internal/[email protected]) Locked, Waiting on external entities

# /usr/bin/fs_cli -x "show calls" | grep 4561a39d-a479-4433-b18b-d6ed636ac1b2
4561a39d-a479-4433-b18b-d6ed636ac1b2,inbound,2023-10-13 09:15:10,1697188510,sofia/internal/[email protected],CS_SOFT_EXECUTE,703,703,xx.xx.174.66,pmq_mydomain,[email protected],,ACTIVE,mydomain,+xxxxxxx7914,SEND,4561a39d-a479-4433-b18b-d6ed636ac1b2,ha-sp-2-3,mydomain,+xxxxxxx7914,,,,,,,,,,,,,,,,,,,

Any ideas what might cause this?

# rpm -q freeswitch
freeswitch-1.10.10.release.17-1.el7.x86_64

jazzl0ver avatar Oct 13 '23 09:10 jazzl0ver