em-websocket icon indicating copy to clipboard operation
em-websocket copied to clipboard

unacked close behavior needs minor improvement

Open sethcall opened this issue 11 years ago • 6 comments

One of two suggestions:

  • The new unack-close detection should never throw exception to server, in the argument that a em-websocket server implementation can't do anything useful with it. A further argument could be made that onerror only makes sense for connections that are up... not receiving a close ACK is really on that hairy edge of being able to say, 'this connection is still up'. After all, the server implementation has already asked em-websocket to close the connection; it's a bit surprising to receive a error 60 seconds later for in a new context (i.e., in onerror).
  • The Connection#close method should be a no-op in the case that the connection has already been closed.

I prefer the former, based on my own experience building client/server applications. It never make sense, in my experience, to raise exceptions after a close.

If you disagree, OK, that's fine, but to illustrate my problem, I'll need to make my onerror handler a little bit better. Currently I do this:

client.onerror { |error|
  if error.kind_of?(EM::WebSocket::WebSocketError)
    @log.error "websockets error: #{error}"
  else
    @log.error "generic error: #{error} #{error.backtrace}"
  end
  cleanup_client(client)

  client.close # XXX: this causes a error loop; in exactly 60 seconds, I get an onerror message again... and then I'll go through this loop and get the same error again, indefinitely
end

So either I (and every consumer) need to guard against trying to call client.close on a disconnected client (which I don't see how to do yet), or you can help us on this.

Thanks much for your hard work, Seth Call

sethcall avatar Apr 28 '14 20:04 sethcall

By the way, I've encountered this scenario at least 20-30 times since this was released, so it's not too rare in practice.

sethcall avatar Apr 28 '14 20:04 sethcall

Hi Seth. Thanks for writing this up - I hadn't considered this.

The reason I haven't hit into this issue is that I do not call close in my onerror handler. There is no reason to do this - em-websocket takes care of closing the connection in every error case. This should really be noted in the readme since it's non-obvious.

That said, there are still race conditions where close could be closed multiple times, which would result in the onerror callback being called multiple times or erroneously. I therefore agree that close should be changed to be a no-op in the case that close is in-progress.

Whether the lack of close ack should be exposed to the application code at all is a valid question; however other protocol errors (e.g. invalid UTF8) do feed through to onerror, and in my experience it can be useful to discover clients/connections that are not behaving according to spec - so I'd rather err on the side of too much information rather than too little.

mloughran avatar Apr 29 '14 14:04 mloughran

Thanks for the response.

Since I don't need to call close in onerror, that will simplify my logic and remove my current 'un-acked' loop.

In looking through code, it seems onerror may be the only message you get in the case of receive_data throwing an exception. (you won't get an onclose). So I'll need to make sure my code can safely try to cleanup a client in both onclose, and onerror, and deal with the case it's called > 1 time (because I'm pretty sure there are cases where you can get both).

Thanks for your help, Seth

sethcall avatar Apr 29 '14 14:04 sethcall

Not quite - in all cases unbind will be called on the Connection object when the tcp connection closes (whether that was done by the client or server), which causes on_close to be called. So you can safely put all your cleanup logic in on_close; I should also add this to the readme :)

mloughran avatar Apr 29 '14 15:04 mloughran

oh! Thank you for clarifying that; much appreciated.

sethcall avatar Apr 29 '14 15:04 sethcall

In this subject i have another problem with onclose. I use a redis counter to count the connected clients. Increment on onopen und decrement on onclose. What i have noticed is, that onclose ist called many times more then onopen and after some hours i have a negative count. in my opinion should every onopen call followed by a onclose call, but it isn't. is this a desired behavior or is it a bug?

Kenterfie avatar Sep 29 '15 11:09 Kenterfie