xmpp4r icon indicating copy to clipboard operation
xmpp4r copied to clipboard

Intermittent connection failure under jruby 1.5.1 Sun jre1.6 (works under jruby 1.4, java 1.5)

Open alx1 opened this issue 15 years ago • 3 comments

Below is the debug log from a failed connection attempt. The code is just calling Simple.new (using xmpp4r-simple) with no other calls/threads etc beforehand. Re-running the same script a couple of times will usually result in a normal, successful connection.

The different between a successful log and an unsuccessful (up to the point of failure) is that successful connections always seem to receive the features before the feature waiting code in accept_features (connection.rb). Notice the lack of a "PROCESSING" line below. Adding "sleep 5" at line 101 (after @features_sem.wait) seems to workaround the issue but this is obviously less than ideal.

Let me know if you need more info.

D, [2010-07-14T13:31:37.902000 #14974] DEBUG -- : Debugging mode enabled. W, [2010-07-14T13:31:37.906000 #14974] WARN -- : Warnings mode enabled. D, [2010-07-14T13:31:37.916000 #14974] DEBUG -- : RESOLVING: _xmpp-client._tcp.deleted D, [2010-07-14T13:31:39.546000 #14974] DEBUG -- : CONNECTING: [deleted]:5222 D, [2010-07-14T13:31:40.836000 #14974] DEBUG -- : SENDING: <stream:stream xmlns:stream='http://etherx.jabber.org/streams' xmlns='jabber:client' to='[deleted]' xml:lang='en' version='1.0' > D, [2010-07-14T13:31:41.952000 #14974] DEBUG -- : RECEIVED: <stream:stream xmlns='jabber:client' xmlns:stream='http://etherx.jabber.org/streams' id='103610479' from='[deleted]' version='1.0' xml:lang='en'/> D, [2010-07-14T13:31:41.965000 #14974] DEBUG -- : RECEIVED: stream:featuresDIGEST-MD5PLAIN/stream:features D, [2010-07-14T13:31:41.970000 #14974] DEBUG -- : FEATURES: waiting... D, [2010-07-14T13:31:41.973000 #14974] DEBUG -- : FEATURES: received D, [2010-07-14T13:31:41.974000 #14974] DEBUG -- : FEATURES: waiting finished D, [2010-07-14T13:31:41.975000 #14974] DEBUG -- : SENDING: /opt/jruby-1.5.1/lib/ruby/gems/1.8/gems/xmpp4r-0.5/lib/xmpp4r/errors.rb:43:in to_s': undefined methoderror' for nil:NilClass (NoMethodError) from /opt/jruby-1.5.1/lib/ruby/gems/1.8/gems/xmpp4r-0.5/lib/xmpp4r/connection.rb:111:in accept_features' from /opt/jruby-1.5.1/lib/ruby/gems/1.8/gems/xmpp4r-0.5/lib/xmpp4r/connection.rb:79:inconnect' from /opt/jruby-1.5.1/lib/ruby/gems/1.8/gems/xmpp4r-0.5/lib/xmpp4r/client.rb:70:in connect' from /opt/jruby-1.5.1/lib/ruby/gems/1.8/gems/xmpp4r-simple-0.8.8/lib/xmpp4r-simple.rb:394:inconnect!' from /opt/jruby-1.5.1/lib/ruby/gems/1.8/gems/xmpp4r-simple-0.8.8/lib/xmpp4r-simple.rb:322:in client' from /opt/jruby-1.5.1/lib/ruby/gems/1.8/gems/xmpp4r-simple-0.8.8/lib/xmpp4r-simple.rb:331:insend!' from /opt/jruby-1.5.1/lib/ruby/gems/1.8/gems/xmpp4r-simple-0.8.8/lib/xmpp4r-simple.rb:147:in status' from /opt/jruby-1.5.1/lib/ruby/gems/1.8/gems/xmpp4r-simple-0.8.8/lib/xmpp4r-simple.rb:90:ininitialize' from /opt/clan/client.rb:42:in `new' from /opt/clan/client.rb:42

alx1 avatar Jul 14 '10 13:07 alx1

We're seeing what looks like the same problem quite frequently under one environment but not under another. In both environments we're using version 0.5 of the xmpp4r gem. Adding the sleep as described in the original issue seems to fix the problem. At first glance, it looks as if #run is being called on the features semaphore (@features_sem) when the features stanza is detected, but before the stanza is processed. So it looks as if (in the #starttls method) the Connection class sees the features response stanza when it's expecting to see the starttls response stanza. But we may have misunderstood.

Debug output for failing case: https://gist.github.com/1067700

The following environment is the one in which the problem does occur :-

$ uname -a
Linux vnsapi1 2.6.38-8-generic #42-Ubuntu SMP Mon Apr 11 03:31:24 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=11.04
DISTRIB_CODENAME=natty
DISTRIB_DESCRIPTION="Ubuntu 11.04"

$ java -version
java version "1.6.0_22"
OpenJDK Runtime Environment (IcedTea6 1.10.1) (6b22-1.10.1-0ubuntu1)
OpenJDK 64-Bit Server VM (build 20.0-b11, mixed mode)

$ ruby --version
jruby 1.6.2 (ruby-1.8.7-p330) (2011-05-23 e2ea975) (OpenJDK 64-Bit Server VM 1.6.0_22) [linux-amd64-java]

But it doesn't seem to happen at all under the following environment (on Mac OSX 10.6.7) :-

$ uname -a
Darwin floehopper 10.7.0 Darwin Kernel Version 10.7.0: Sat Jan 29 15:17:16 PST 2011; root:xnu-1504.9.37~1/RELEASE_I386 i386

$ java -version
java version "1.6.0_24"
Java(TM) SE Runtime Environment (build 1.6.0_24-b07-334-10M3326)
Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02-334, mixed mode)

$ ruby -v
jruby 1.6.2 (ruby-1.8.7-p330) (2011-05-23 e2ea975) (Java HotSpot(TM) 64-Bit Server VM 1.6.0_24) [darwin-x86_64-java]

floehopper avatar Jul 06 '11 16:07 floehopper

I think the problem is the proceed instruction after starttls:

<proceed xmlns='urn:ietf:params:xml:ns:xmpp-tls'/>

We have the same problem with logstash 1.2.1 and it's XMPP output, also crashing after the proceed server response. Java is 1.7.0_25, OpenJDK Runtime Environment (IcedTea 2.3.10).

cweiske avatar Sep 10 '13 09:09 cweiske

I'm also seeing this with logstash 1.4.2 / xmpp4r 0.5

I'm thinking it might be the following clause at the end of the starttls method that causes it:

rescue
  error = $!
ensure
  Jabber::debuglog("TLSv1: restarting parser")
  start
  accept_features
  raise error if error
end

jrgns avatar Oct 16 '14 04:10 jrgns