Jabber-Net
Jabber-Net copied to clipboard
Jabber response is not delivered to recipient if sendnig thread is the same that was listened before
We are using Jabber as interprocess communication system between different part of our application. We have file explorer addin and Agent. The Addin communicates with the Agent to get necessary information.
Recently we have encountered with the issue, when the addin was unable to get Jabber response from Agent, while the Agent reported that the response was sent, and prosody logs confirmed that.
I have discovered curious dependency from our logs: if Jabber request was sent from the same thread that was received the response before, it is unable to receive the response. You can see what I mean from the logs below:
20 Sep 2018 15:15:52,970 [1] DEBUG: !!SendPayload send
20 Sep 2018 15:15:52,994 [19] DEBUG: !!SendPayload received
20 Sep 2018 15:15:53,133 [23] DEBUG: !!SendPayload send
20 Sep 2018 15:15:53,158 [19] DEBUG: !!SendPayload received
20 Sep 2018 15:15:56,031 [1] DEBUG: !!SendPayload send
20 Sep 2018 15:15:56,038 [19] DEBUG: !!SendPayload received
20 Sep 2018 15:15:56,041 [1] DEBUG: !!SendPayload send
20 Sep 2018 15:15:56,062 [19] DEBUG: !!SendPayload received
20 Sep 2018 15:15:56,213 [19] DEBUG: !!SendPayload send
As you can see, when thread 19 was attempted to send the request, it was not able to receive the response.
Truncated part of our code operating with Jabber:
protected TResponseType SendPayload<TRequestType, TResponseType>(TRequestType request)
where TRequestType : class
where TResponseType : class
{
var resetEvent = new AutoResetEvent(false);
try
{
Context.Current.Logger.DebugFormat("!!SendPayload send {0}", request);
Exception ex = null;
var response = default(TResponseType);
SendPayload(request, (state, iq) =>
{
try
{
if (iq.Type.HasFlag(IQType.error))
{
ex = CreateAndLogException(iq);
Context.Current.Logger.ErrorFormat("Error Request: {0}", request.ToString());
return;
}
response = Payload.Get<TResponseType>(iq);
}
catch (Exception innerEx)
{
ex = innerEx;
}
finally
{
resetEvent.Set();
Context.Current.Logger.DebugFormat("!!SendPayload received {0}", request);
}
});
var token = _cancelTokenSource.Token;
var waitResult = WaitHandle.WaitAny(new WaitHandle[] { resetEvent, token.WaitHandle }, ServiceTimeout);
}
......................
}
catch (Exception ex)
{
Context.Current.Logger.Error("SendPayload Exception.", ex);
throw;
}
finally
{
resetEvent.Close();
}
}
The stack trace stops on WaitHandle.WaitAny() command.
Actually, I'm not sure whether it is our application issue, or Jabber configuration issue, or Jabber code issue. Any suggestions are welcome.
P.S. We didn't get any errors with that code before. It was occurred when we tried to add new functionality into the Addin. Unfortunately, we don't control Addin threading, just file explorer does it. I'm trying to resolve if from Addin side (currently without success), but I hope you can provide any ideas what is happened here...
Yeah, I'm aware of gross threading issues in Jabber-Net and I'm trying to fix them whenever I can. It is likely that your issue is caused by Jabber-Net code.
The current situation is that I could really use some help on this kind of issues. They are extremely hard to reproduce and are extremely hard to fix without breaking anything :(
For now, I'm mainly working on #95 and it required a big rework of our internals. If you could debug your issue and provide any local "fix" for it, I think I'd merge that for now.
Unit tests are much appreciated!
Could you please also provide information about what version are you using?
It was reproduced on old version 2.1.0.710, but before creating the issue I was upgraded my project to 3.0.0.37668, and it was reproduced anyway.
Btw, we're continue using of old 2.1.0 due to unreleased critical (for us) issue. Just waiting for merging and releasing. :)
Regarding to the help from my side: unfortunately I can't provide an example, because it is part of enterprise solution, and file explorer addin uses licensed component which I can't share.
But I can help with testing of any versions of JabberNet for that case.
Yes, I understand that #95 is critical pretty much for everyone. It is very painful to fix, but I'm working on it.