activemessaging icon indicating copy to clipboard operation
activemessaging copied to clipboard

High number of disconnects & growth of CLOSE_WAIT connections

Open mrgordon opened this issue 11 years ago • 3 comments

I run the poller using RAILS_ENV=integration bundle exec script/poller run. It processes messages correctly although I get relatively frequent "resetting connection" messages.

poller: process with pid 74811 started.
connection.receive returning EOF as nil - resetting connection.
connection.receive returning EOF as nil - resetting connection.
connection.receive returning EOF as nil - resetting connection.
connection.receive returning EOF as nil - resetting connection.
connection.receive returning EOF as nil - resetting connection.
connection.receive returning EOF as nil - resetting connection.
connection.receive returning EOF as nil - resetting connection.
connection.receive returning EOF as nil - resetting connection.
connection.receive returning EOF as nil - resetting connection.

I see CLOSE_WAIT connections piling up in "sudo lsof | grep 61613":

ruby      74811     user   15u     IPv6 0xffffff80118a6d80         0t0      TCP localhost:50892->localhost:61613 (CLOSE_WAIT)
ruby      74811     user   16u     IPv6 0xffffff80118a6600         0t0      TCP localhost:50865->localhost:61613 (CLOSE_WAIT)
ruby      74811     user   17u     IPv6 0xffffff800ea70600         0t0      TCP localhost:50896->localhost:61613 (CLOSE_WAIT)
ruby      74811     user   18u     IPv6 0xffffff80118a5340         0t0      TCP localhost:50906->localhost:61613 (CLOSE_WAIT)
ruby      74811     user   19u     IPv6 0xffffff80118a5ac0         0t0      TCP localhost:50925->localhost:61613 (CLOSE_WAIT)
ruby      74811     user   20u     IPv6 0xffffff80118a4f80         0t0      TCP localhost:50928->localhost:61613 (CLOSE_WAIT)
ruby      74811     user   21u     IPv6 0xffffff80118a4bc0         0t0      TCP localhost:50954->localhost:61613 (CLOSE_WAIT)
ruby      74811     user   22u     IPv6 0xffffff80118a5700         0t0      TCP localhost:50964->localhost:61613 (CLOSE_WAIT)
ruby      74811     user   23u     IPv6 0xffffff80118a4800         0t0      TCP localhost:50974->localhost:61613 (CLOSE_WAIT)
ruby      74811     user   24u     IPv6 0xffffff80118a4440         0t0      TCP localhost:50978->localhost:61613 (ESTABLISHED)

My HornetQ logs are filled up with the disconnects as well:

13:17:59,475 WARN  [org.hornetq.core.server] HQ222067: Connection failure has been detected: HQ119014: Did not receive data from /0:0:0:0:0:0:0:1%0:50982. It is likely the client has exited or crashed without closing its connection, or the network between the server and client has failed. You also might have configured connection-ttl and client-failure-check-period incorrectly. Please check user manual for more information. The connection will now be closed. [code=CONNECTION_TIMEDOUT]
13:17:59,476 WARN  [org.hornetq.core.server] HQ222061: Client connection failed, clearing up resources for session 98b76aff-a21b-11e2-8568-695c1cfc13ed
13:17:59,477 WARN  [org.hornetq.core.server] HQ222107: Cleared up resources for session 98b76aff-a21b-11e2-8568-695c1cfc13ed
13:19:01,506 WARN  [org.hornetq.core.server] HQ222067: Connection failure has been detected: HQ119014: Did not receive data from /0:0:0:0:0:0:0:1%0:50987. It is likely the client has exited or crashed without closing its connection, or the network between the server and client has failed. You also might have configured connection-ttl and client-failure-check-period incorrectly. Please check user manual for more information. The connection will now be closed. [code=CONNECTION_TIMEDOUT]
13:19:01,509 WARN  [org.hornetq.core.server] HQ222061: Client connection failed, clearing up resources for session bdb071e0-a21b-11e2-8568-695c1cfc13ed
13:19:01,510 WARN  [org.hornetq.core.server] HQ222107: Cleared up resources for session bdb071e0-a21b-11e2-8568-695c1cfc13ed

Is this expected behavior? Should I be doing something differently?

mrgordon avatar Apr 10 '13 20:04 mrgordon

It looks like the CLOSE_WAITs go away after approximately an hour, but I'd still love to know if it is expected that I would see so many disconnects.

mrgordon avatar Apr 10 '13 22:04 mrgordon

Hello,

Can you tell me what broker / adapter you are using, and version of ruby?

Andrew Kuklewicz

On Wed, Apr 10, 2013 at 6:06 PM, Matthew Gordon [email protected]:

It looks like the CLOSE_WAITs go away after approximately an hour, but I'd still love to know if it is expected that I would see so many disconnects.

— Reply to this email directly or view it on GitHubhttps://github.com/kookster/activemessaging/issues/32#issuecomment-16205469 .

kookster avatar Apr 10 '13 22:04 kookster

Ruby: 1.9.3-p125 Adapter: STOMP

mrgordon avatar Apr 10 '13 22:04 mrgordon