jain-sip icon indicating copy to clipboard operation
jain-sip copied to clipboard

OK reponse for INVITE gets retransmitted although ACK was received

Open aliebig opened this issue 8 years ago • 2 comments

There is a race condition between starting the retransmit timer for the OK response to an INVITE and the ACK processing.

The issue occurs when the ACK is received and processed shortly after sending the OK response. This happens in real operation and can easily be reproduced by setting a thread suspending breakpoint in SIPServerTransactionImpl (see below).

SIPServerTransactionImpl
public void sendResponse(Response response) throws SipException {
 [...]
 this.sendMessage((SIPResponse)response);
 if(dialog != null) {			//<<<<<<<<<<<<<<<<<<< Breakpoint here
	dialog.startRetransmitTimer(this, (SIPResponse)response);
 }
 [...]
}

How to reproduce:

  1. Set breakpoint
  2. Initiate incoming call
  3. Accept incoming call
  4. OK is sent, breakpoint is hit
  5. Make sure ACK is received
  6. Resume thread

Expected:

  • Since ACK was received, OK is not retransmitted

Observed:

  • OK gets retransmitted when its timer triggers

Analysis

The root cause of the issue is that the OK gets sent and afterwards its retransmit timer task gets started. If the incoming ACK is processed before the timer is started, the OK will get retransmitted because the timer task has not been started yet when the ACK is processed, and therefore does not get stopped.

Sending of OK

SIPServerTransactionImpl
public void sendResponse(Response response) throws SipException {
 [...]
 this.sendMessage((SIPResponse)response);
 if(dialog != null) {
[JainSipOkAckRaceCondition.txt](https://github.com/RestComm/jain-sip/files/726293/JainSipOkAckRaceCondition.txt)
[JainSipOkAckRaceCondition.txt](https://github.com/RestComm/jain-sip/files/726300/JainSipOkAckRaceCondition.txt)


	// If received ACK is processed before the next instruction, then the issue occurs
	dialog.startRetransmitTimer(this, (SIPResponse)response);
	// this subsequently calls SipDialog.startRetransmitTimer() which sets the timerTask
 }

Receiving of ACK

SipDialog.ackReceived(long cseqNumber){
	[...]
	SIPServerTransaction tr = this.getInviteTransaction();
		// gets transaction from timerTask if timerTask != null. Since timer has not been started yet, returns null
        if(tr != null) {
		// => this is where the timerTask would get stopped
	       [...]
        } else if(logger.isLoggingEnabled(32)) {
                logger.logDebug("tr is null -- not updating the ack state");
        }	
}

The resulting thread ordering and log output is as follows: [Thread 1] sipServerTransaction::sendMessage SIP/2.0 200 OK [Thread 2] processRequest: ACK sip:[email protected]:5070 SIP/2.0 [Thread 2] tr is null -- not updating the ack state; [Thread 1] startRetransmitTimer() 200 method INVITE;

aliebig avatar Jan 24 '17 10:01 aliebig

hi @aliebig thanks for comprehensive report.

Unfortunately JAIN-SIP stack is not very good at these scenarios, with different thread pools for traffic and timers. Would you be interested on contributing with a patch here? It seems you already have a good understanding on the situation... We may discuss solutions if you like, thoughts?

jaimecasero avatar Mar 17 '17 14:03 jaimecasero

Unfortunately, I currently do not have enough time available at the moment to work on the issue. Sorry.

If time permits it in the future, and the issue is still open, I would be happy to contribute and provide a patch.

aliebig avatar Apr 05 '17 07:04 aliebig