websockets
websockets copied to clipboard
Unhandled exception during WebSocketServer.close
I am using python 3.10.4 and websockets 10.3. As part of my shutdown during a SystemExit, I call close followed by wait_close on my WebSocketServer object. There are active client connections from a JavaScript-based client at the time. Asyncio occasionally reports that an unhandled exception occurred which I listed below.
I am able to work around the issue by using a custom asyncio loop set_exception_handler to suppress reporting the exception. However, I wanted to report this issue as it seems to me there is a decent chance it is a bug in websockets.
Task exception was never retrieved
future: <Task finished name='Task-52' coro=<WebSocketCommonProtocol.close() done, defined at /Users/johnmccrone/miniforge3/envs/wside/lib/python3.10/site-packages/websockets/legacy/protocol.py:725> exception=RuntimeError('Cannot call write() after write_eof()')>
Traceback (most recent call last):
<backtrace for code that called sys.exit(0)>
SystemExit: 0
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/johnmccrone/miniforge3/envs/wside/lib/python3.10/site-packages/websockets/legacy/protocol.py", line 750, in close
await asyncio.wait_for(
File "/Users/johnmccrone/miniforge3/envs/wside/lib/python3.10/asyncio/tasks.py", line 445, in wait_for
return fut.result()
File "/Users/johnmccrone/miniforge3/envs/wside/lib/python3.10/site-packages/websockets/legacy/protocol.py", line 1216, in write_close_frame
await self.write_frame(True, OP_CLOSE, data, _state=State.CLOSING)
File "/Users/johnmccrone/miniforge3/envs/wside/lib/python3.10/site-packages/websockets/legacy/protocol.py", line 1188, in write_frame
self.write_frame_sync(fin, opcode, data)
File "/Users/johnmccrone/miniforge3/envs/wside/lib/python3.10/site-packages/websockets/legacy/protocol.py", line 1159, in write_frame_sync
frame.write(
File "/Users/johnmccrone/miniforge3/envs/wside/lib/python3.10/site-packages/websockets/legacy/framing.py", line 146, in write
write(self.new_frame.serialize(mask=mask, extensions=extensions))
File "/Users/johnmccrone/miniforge3/envs/wside/lib/python3.10/asyncio/selector_events.py", line 900, in write
raise RuntimeError('Cannot call write() after write_eof()')
RuntimeError: Cannot call write() after write_eof()
I agree: this shouldn't happen.
Here's what I can infer from the stack trace.
future: <Task finished name='Task-52' coro=<WebSocketCommonProtocol.close() done, defined at websockets/legacy/protocol.py:725> exception=RuntimeError('Cannot call write() after write_eof()')>
This is a task created by this code in WebSocketServer.close():
close_tasks = [
asyncio.create_task(websocket.close(1001))
for websocket in self.websockets
if websocket.state is not State.CONNECTING
]
Indeed, this is the only place in websockets that creates a task running WebSocketCommonProtocol.close() and it matches what you're describing.
When close() starts running:
state is State.OPEN, else it would return immediately and it wouldn't reachwrite_close_frame.write_eof()was called on the transport, or something having the same effect, as reporting byRuntimeErrorthat occurs before yielding control to the event loop.
Now I'd like to understand when write_eof() is called.
- websockets calls it from
WebSocketCommonProtocol.close_connection(). This could be the culprit if you cancel the connection handlers just before shutting down the server. Are you doing that? Perhaps you can share the code that handles SystemExit? In that case, the answer is "don't do that, websockets takes care of canceling the connection handlers for you". - asyncio calls it. However, I looked at asyncio and didn't find any place where this function is called and could be the culprit. (It's called three times in the entire library and none of them are likely to be happening here.)
Unclear that it's a bug after all.
Thanks @aaugustin for your response. I am not cancelling the connection handler preemptively, but there is a common pattern in my code to catch CancelledError in many of the coroutines and do some cleanup. Based off what you are saying, it seems likely the library wants the CancelledError to propagate instead of being handled. I am thinking right now that whatever timing issue I'm hitting is probably unlikely to happen if I just reraise CancelledError instead of calling break or wait_closed. The 3 second timeouts here are not being hit just to add some more details about what is happening.
Here is the relevant section of the code handling SystemExit:
async def stop(self):
async def close_frame_server(
ws_frm_server: websockets.WebSocketServer,
):
ws_frm_server.close()
await ws_frm_server.wait_closed()
close_task = asyncio.create_task(
close_frame_server(self._ws_frame_server),
)
_, pending_set = await asyncio.wait(
(
close_task,
),
timeout=3,
)
and the handler:
async def ws_frame_handler(
self,
connection: websockets.WebSocketServerProtocol,
):
while True:
try:
data = await connection.recv()
await self._queue_cmd.put(json.loads(data))
except (asyncio.CancelledError, KeyboardInterrupt):
async def wait_close(
connection: websockets.WebSocketServerProtocol,
):
await connection.wait_closed()
wait_close_task = asyncio.create_task(
wait_close(connection),
)
await asyncio.wait(
(wait_close_task,),
timeout=3,
)
break
except Exception as err:
break
Indeed, I believe that you should always re-raise asyncio.CancelledError when you catch it, else all bets are off. (This is why it no longer inherits Exception — to prevent accidentally swallowing it in an except Exception statement.) Generally speaking, the less cancellation, the better...
It isn't clear to me how you ended up with this big try/except block, which doesn't appear to do anything very useful — it waits up to three seconds for the connection to close... before closing it anyway (which happens when you exit the connection handler).
The library takes cares of closing connections. You don't have to think about it. I believe you would get the right behavior with this:
async def ws_frame_handler(
self,
connection: websockets.WebSocketServerProtocol,
):
while True:
data = await connection.recv()
await self._queue_cmd.put(json.loads(data))
And if you need other cleanup:
async def ws_frame_handler(
self,
connection: websockets.WebSocketServerProtocol,
):
try:
while True:
data = await connection.recv()
await self._queue_cmd.put(json.loads(data))
finally:
... # cleanup (other than closing the connection, handled automatically) goes here
Thanks @aaugustin for your response. The historical reasoning why the try/except block has the added complexity is previously it just did a break which ran into some other issue with a different RuntimeError exception being thrown with I believe the same Cannot call write() after write_eof() reason. This block lowered the frequency of the issue. It is clear now that the behavior was not correct in either case (it should not catch the exception at all).
I am unsure as to the original reason why the exception was caught at all in the first place, but it may be related to not wanting to catch the exception in the Exception block which I simplified in my previous comment (but as you point out this makes no sense anymore in the version of python that is being used).
Thank you for looking into what I thought might of been a bug (and I am happy to see it is not a bug). I think your explanations will also be helpful to others in the off chance someone else runs into a similar issue.