WebSocket-for-Python
WebSocket-for-Python copied to clipboard
Server crashes with broken pipe if client disconnects ungracefully?
Hey, Love your library but having a little issue with cherrypy 3.3.0 on python 3.3.3 using chromium Version 31.0.1650.63 Debian jessie/sid (238485) on linux mint.
Basically, i can connect to the websocket, but if i close that tab and reopen it, it does not work. If I call socket.close in the client first, the problem dissapears. the server log shows a broken pipe error here in websocket.py around line 232:
def _write(self, b):
"""
Trying to prevent a write operation
on an already closed websocket stream.
This cannot be bullet proof but hopefully
will catch almost all use cases.
"""
if self.terminated or self.sock is None:
raise RuntimeError("Cannot send on a terminated websocket")
self.sock.sendall(b)
I found a possible fix, I changed the function to: def _write(self, b): """ Trying to prevent a write operation on an already closed websocket stream.
This cannot be bullet proof but hopefully
will catch almost all use cases.
"""
if self.terminated or self.sock is None:
raise RuntimeError("Cannot send on a terminated websocket")
try:
self.sock.sendall(b)
except BrokenPipeError:
pass
And the error dissapears. I assume that ignoring the error is fine because we are closing the whole socket anyway, however I don't know much about the inner workings of ws4py.
Thanks for the write up. Those corner cases are always tricky to reproduce I guess. I will try to find some time to analyze your proposed solution and will merge it if it fits.
Thanks again!
After looking into it a little more, this won't work because BrokenPipeError only exists in python 3. Also, the error always seems to come from the close() function only, as in
Exception in thread Thread-1: Traceback (most recent call last): File "/media/Files/Programming/KaithemAutomation/kaithem/src/thirdparty/ws4py/websocket.py", line 185, in close self._write(self.stream.close(code=code, reason=reason).single(mask=self.stream.always_mask)) File "/media/Files/Programming/KaithemAutomation/kaithem/src/thirdparty/ws4py/websocket.py", line 254, in _write self.sock.sendall(b) File "/usr/lib/python3.3/ssl.py", line 460, in sendall v = self.send(data[count:]) File "/usr/lib/python3.3/ssl.py", line 421, in send v = self._sslobj.write(data) BrokenPipeError: [Errno 32] Broken pipe
The errors are occuring in the websocket manager thread. My current solution is to treat any error that occurs during the call of once() that the manager does as if it returned a None. It would probably be better to not treat all errors the same but for production use I think it makes sense, because if the manager thread is pretty important to keep running.
Not sure if that was what you meant, but my workaround is to simply override the once() function and close the connection if any error occurs running once(). TestWebSocket is my class that inherits from WebSocket:
def once(self):
try:
return super(TestWebSocket, self).once()
except Exception as e:
print("Closing socket because of error:" + str(e))
self.close()
Wouldn't a nicer solution to have an "handle_error" function on WebSocket that the user can override if there are any problems? And a try over the once()-call in run()? Its behavior could be to write any unhandled errors to the log and close the connection.
Having crashing the entire server as expected behavior if unhandled errors occurs seems like a quite controversial solution. :-)
I can write a pull request if you think something like this is a usable solution.
I do like the handle_error callback handler, specially since it has been often requested.
Ok, I'll make something.
Hm. I was going to make this thing now, and when I tried to recreate the issue (with the master branch) i couldn't.
It seems that it had been fixed in some later commit. @EternityForest, can you recreate the problem with the current master?
Also, I am not sure if an error handler is needed now or how it is to be invoked, as there seem to be some other handling somewhere.
It would appear that a new release would be a good thing. :-)
Ok, now I managed to recreate it. It was only the client that had started to disconnect more gracefully. I'll be back.
I got the same issue with a timeout. After that, the server just open/close websocket.
Exception in thread Thread-2:
Traceback (most recent call last):
File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
self.run()
File "/usr/local/lib/python2.7/dist-packages/ws4py/manager.py", line 310, in run
if not ws.once():
File "/usr/local/lib/python2.7/dist-packages/ws4py/websocket.py", line 388, in once
if not self.process(b):
File "/usr/local/lib/python2.7/dist-packages/ws4py/websocket.py", line 466, in process
self._write(s.pong(ping.data))
File "/usr/local/lib/python2.7/dist-packages/ws4py/websocket.py", line 279, in _write
self.sock.sendall(b)
File "/usr/lib/python2.7/socket.py", line 228, in meth
return getattr(self._sock,name)(*args)
timeout: timed out
IMO, without even trying to reproduce, manager should never stop running due to something bad in a websocket. So catching any exception and force-close the websocket that have an issue should be done in the code to prevent the manager and others websocket to be impacted :)
I have a modified version of the library in use that seems to handle ungraceful disconnect and suspend/wake just fine. I should probably do some more testing with different browsers.
I think there might be a development branch of this repo, or else some pull requests that fix it. I think I might even have submitted some but I'm on mobile and it was a while back.
The code I've been using is here though, and it seems to work fine running 24/7 at a commercial installation albeit without a lot of ungraceful disconnects or anything like that.
https://github.com/EternityForest/KaithemAutomation/tree/master/kaithem/src/thirdparty/ws4py
Thanks for the feedback @EternityForest . I'm currently testing the try/catch in once() in production. I'm not using webbrowser, but ws4py client to ws4py server, using python on android, under very bad network condition. And the server hangs too easily.
I'll check your code, thanks a lot!
I wonder if it wouldn't be helpful to add some of this to the unit tests? Could we create a loopback client that was modular and could be plugged into the different servers, that tested things like disconnection, bad data, regression testing that SSL one message long delay queue bug, etc?
Yes, i guess. But some are so network stack related it's hard to get them. After implementing the try catch in once, it got further to another exception:
Exception in thread Thread-2:
Traceback (most recent call last):
File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
self.run()
File "/usr/local/lib/python2.7/dist-packages/ws4py/manager.py", line 312, in run
fd = ws.sock.fileno()
AttributeError: 'NoneType' object has no attribute 'fileno'
I really wish to have commit access at least to solve thoses issues that can be so easy to get with bad network :(
EDIT: i though you fixed it in your own version, but no
I could add collaborators to this project I guess.
@Lawouach i just wrote you an email as an introduction to a possible collaboration :)
Line 312 in my version doesn't check fileno, and my version has the server thread named "WebsocketManager", so I'm guessing you're using a patched version?
I think my version and the official one have diverged to the point where patches aren't compatible?
Although I'm not seeing what could possibly be setting the sock property to None, unless it has something to do with the Heartbeat thread, but that's only for the client right?
It seems like somehow the once() gets called on an already terminated WebSocket that has no sock() property, and somehow makes it through the polling, or else somehow gets terminated between the poll and the check?
But in my version so far as I know termination can only happen from the manager thread, at least on cherrypy. Is it different in the official version or on other servers?
This is an odd bug.
Here's a proper fork that might be easier to test, that should be an exact copy and paste of what I've been using in production: https://github.com/EternityForest/WebSocket-for-Python
EDIT:
Just tested it with Cherrypy, chrome as the client, and the echo_cherrypy_server. If chrome is killed by the SIGKILL when running over SSL, I get:
[2017-03-23 14:52:17,012] ERROR Terminating websocket [Local => 127.0.0.1:9000 | Remote => 127.0.0.1:36204] due to exception: SysCallError(-1, 'Unexpected EOF') in once method
And then it continues functioning as before. Without SSL, it doesn't even seem to notice that the disconnection didn't happen properly, and prints the normal disconnect message.
I believe this is finally fixed. Please open a new issue if you have any thing going on with latest version (> 0.3.5, use master for now until next version is released.)
Still experiencing the same issue on 0.4.2. My solution is incredibly hacky but connection cleanup works fine. (It's a monkey-patch)
from ws4py.websocket import WebSocket as WebSocket_
class WebSocket(WebSocket_):
def _write(self, b):
"""
Trying to prevent a write operation
on an already closed websocket stream.
This cannot be bullet proof but hopefully
will catch almost all use cases.
"""
if self.terminated or self.sock is None:
raise RuntimeError("Cannot send on a terminated websocket")
"""
If the client disconnects in an unexpected way.
"""
try:
self.sock.sendall(b)
except BrokenPipeError:
pass
Is this error still shutting down the entire server thread? Can we get a full traceback?
Sometimes a print statement is executed that does not kill the server thread.
Error when terminating the connection: [Errno 32] Broken pipe
Sometimes an exception is raised that does kill the server thread. I'm not able to reproduce it at the moment, will get try to do so as soon as possible.