Silently eat asyncio.exceptions.CancelledError?

Open ShaheedHaque opened this issue 5 months ago • 6 comments

Hi,

When I use channels_rabbitmq to support Websockets, the browser at the client end can of course navigate away at any time without warning. When it does, the try-except block around:

https://github.com/CJWorkbench/channels_rabbitmq/blob/1d18c4c079fa5c0ba59b74437274b18f83f6c7ed/channels_rabbitmq/reader.py#L32

exits, and the higher layers report this:

2025-06-18 08:52:19,213 [ERROR] carehare._consume_channel: Closing consumer
Traceback (most recent call last):
  File "/home/ubuntu/venv/lib/python3.12/site-packages/channels_rabbitmq/reader.py", line 32, in consume_into_multi_queue_until_connection_close
    body, delivery_tag = await consumer.next_delivery()
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/venv/lib/python3.12/site-packages/carehare/_consume_channel.py", line 196, in next_delivery
    return await _next_delivery(self._queue, self.closed)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/venv/lib/python3.12/site-packages/carehare/_consume_channel.py", line 39, in _next_delivery
    done, pending = await asyncio.wait(
                    ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/asyncio/tasks.py", line 464, in wait
    return await _wait(fs, timeout, return_when, loop)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/asyncio/tasks.py", line 550, in _wait
    await waiter
asyncio.exceptions.CancelledError

Now the said exception handler looks like this:

except carehare.ConnectionClosed:
    pass

Would it be reasonable to similarly eat asyncio.exceptions.CancelledError? It would certainly get rid of these Tracebacks from my logs!

ShaheedHaque, Jun 18 '25 09:06

Who is calling cancel, and why?

adamhooper, Jun 18 '25 12:06

I had (lazily) assumed the networking socket closed unexpectedly.

ShaheedHaque, Jun 18 '25 13:06

You'll need to figure out where the cancel is coming from. The function you point to is the "reconnect forever" function that connects the server to RabbitMQ. Disconnects shouldn't cancel it, and close works without cancel.

adamhooper, Jun 18 '25 22:06
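
One possible way to find out who is calling cancel(), sketched under assumptions: install a task factory that creates a Task subclass whose cancel() records the caller's stack. TracingTask, tracing_task_factory and demo are hypothetical names for illustration only; none of this is part of channels_rabbitmq or carehare, and it only traces tasks created after the factory is installed on the running loop.

```python
import asyncio
import traceback


class TracingTask(asyncio.Task):
    """Hypothetical Task subclass that records who called cancel()."""

    # (task name, formatted call stack) pairs, newest last.
    cancel_stacks = []

    def cancel(self, msg=None):
        # Capture the stack of the code that requested the cancellation.
        stack = "".join(traceback.format_stack())
        TracingTask.cancel_stacks.append((self.get_name(), stack))
        return super().cancel(msg)


def tracing_task_factory(loop, coro, **kwargs):
    # Extra keyword arguments (if the loop passes any) go straight to Task.
    return TracingTask(coro, loop=loop, **kwargs)


async def demo():
    TracingTask.cancel_stacks.clear()
    asyncio.get_running_loop().set_task_factory(tracing_task_factory)

    # Stand-in for a long-running consumer task.
    task = asyncio.create_task(asyncio.sleep(3600), name="consumer")
    await asyncio.sleep(0)  # let the task start

    task.cancel()  # this call site ends up in the recorded stack
    try:
        await task
    except asyncio.CancelledError:
        pass
    return TracingTask.cancel_stacks
```

Running asyncio.run(demo()) returns the recorded stacks; in a real server one would log them instead of collecting them in a list, and the stack should point at whatever framework or shutdown code issued the cancel.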

Thanks for explaining. I'll be back when I know more.

ShaheedHaque, Jun 19 '25 03:06

I found the trigger for this condition. It only happens on our production servers where we run Django "under" gunicorn, and we have gunicorn configured to rotate the Python code after some number of requests. Thus, when gunicorn records this:

[2025-06-26 13:54:35 +0000] [1374160] [WARNING] Maximum request limit of 109 exceeded. Terminating process.
[2025-06-26 13:54:35 +0000] [1374160] [INFO] Shutting down
[2025-06-26 13:54:35 +0000] [1374160] [INFO] Error while closing socket [Errno 9] Bad file descriptor
[2025-06-26 13:54:35 +0000] [1374160] [INFO] connection closed 
...

we can see that the timestamp of 13:54:35 is a close match for:

2025-06-26 13:54:35,202 [ERROR] carehare._consume_channel: Closing consumer
Traceback (most recent call last):
  File "/home/ubuntu/venv/lib/python3.12/site-packages/channels_rabbitmq/reader.py", line 32, in consume_into_multi_queue_until_connection_close
    body, delivery_tag = await consumer.next_delivery()
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/venv/lib/python3.12/site-packages/carehare/_consume_channel.py", line 196, in next_delivery
    return await _next_delivery(self._queue, self.closed)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/venv/lib/python3.12/site-packages/carehare/_consume_channel.py", line 39, in _next_delivery
    done, pending = await asyncio.wait(
                    ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/asyncio/tasks.py", line 464, in wait
    return await _wait(fs, timeout, return_when, loop)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/asyncio/tasks.py", line 550, in _wait
    await waiter
asyncio.exceptions.CancelledError
2025-06-26 13:54:35,442 [INFO] ...something else...

I also looked further into the CancelledError, and if it is decided that eating this exception is a good thing to do, then it would be worth considering the documentation at https://docs.python.org/3/library/asyncio-task.html#task-cancellation, and especially the bit about swallowing this exception. IIUC, I think it should be safe in this context because consume_into_multi_queue_until_connection_close is effectively at the top level (and anyway, the process is about to die).

ShaheedHaque, Jun 26 '25 14:06
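
For reference, a minimal sketch of what the asyncio documentation's warning about swallowing means in practice. It uses plain asyncio with a sleep as a stand-in for the consumer loop; swallow, cleanup_and_reraise and compare are hypothetical names, and nothing here is taken from channels_rabbitmq. A coroutine that swallows CancelledError finishes "normally", so the task never reports itself as cancelled, whereas one that cleans up and re-raises does.

```python
import asyncio


async def swallow():
    try:
        await asyncio.sleep(3600)  # stand-in for waiting on the next delivery
    except asyncio.CancelledError:
        pass  # swallowed: the cancel request is silently discarded
    return "finished normally"


async def cleanup_and_reraise(log):
    try:
        await asyncio.sleep(3600)
    except asyncio.CancelledError:
        log.append("cleanup")  # do any cleanup, then propagate the cancel
        raise


async def compare():
    log = []
    t1 = asyncio.create_task(swallow())
    t2 = asyncio.create_task(cleanup_and_reraise(log))
    await asyncio.sleep(0)  # let both tasks reach their await

    t1.cancel()
    t2.cancel()
    results = await asyncio.gather(t1, t2, return_exceptions=True)
    return t1.cancelled(), t2.cancelled(), results, log
```

With asyncio.run(compare()), the first task reports cancelled() as False and returns its string, while the second reports True. Whether the first behaviour is acceptable depends on whether anything upstream relies on the task actually being cancelled.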

Good sleuthing!

Now ... Still, what is calling cancel()? You can (and should) shut down without it. Two reasons:

  • Presumably you still have open WebSocket connections. You should close those so the client knows to reconnect.
  • At any given moment you may have pending deliveries en route to RabbitMQ. You shouldn't drop those.

Search for "graceful shutdown" ... It's a whole world.

adamhooper, Jun 26 '25 19:06
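
A rough sketch of the cancel-free shutdown idea from the two bullets above, using only the standard library. Everything here is a hypothetical stand-in: the queue plays the role of pending deliveries, the dicts play the role of open WebSocket connections, and _STOP, worker, shutdown and run_demo are illustrative names, not channels_rabbitmq, carehare or gunicorn APIs. The point is only that the worker is asked to stop via a sentinel and is awaited, so queued work is delivered and no task is cancelled.

```python
import asyncio

_STOP = object()  # hypothetical sentinel meaning "no more work"


async def worker(queue, delivered):
    """Deliver queued items until the stop sentinel arrives."""
    while True:
        item = await queue.get()
        if item is _STOP:
            break
        await asyncio.sleep(0)  # stand-in for real network I/O
        delivered.append(item)


async def shutdown(queue, worker_task, connections):
    # 1. Close client connections so clients know to reconnect
    #    (stand-in for sending a real WebSocket close frame).
    for conn in connections:
        conn["open"] = False
    # 2. Ask the worker to stop after everything already queued.
    await queue.put(_STOP)
    # 3. Wait for it to finish; no cancel() involved.
    await worker_task


async def run_demo():
    queue = asyncio.Queue()
    delivered = []
    connections = [{"id": 1, "open": True}, {"id": 2, "open": True}]
    worker_task = asyncio.create_task(worker(queue, delivered))

    for n in range(3):
        queue.put_nowait(f"msg-{n}")

    await shutdown(queue, worker_task, connections)
    return delivered, connections, worker_task.cancelled()
```

In this toy run all three queued messages are delivered before the worker exits, and the worker task ends without ever being cancelled. A real server would trigger shutdown from its process-termination hook, which is outside what this sketch covers.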