docker-red-discordbot

restart: on-failure (or unless-stopped) not honored when aiohttp exception raised (Temporary failure in name resolution)

Open lukeomatik opened this issue 4 months ago • 1 comments

Sorry for closing and reopening the issue, I fucked up.

Describe the bug

Shutting down from an unhandled exception does not trigger the container restart specified in my Docker Compose file (restart: unless-stopped).

In my case, I received: aiohttp.client_exceptions.ClientConnectorError: Cannot connect to host discord.com:443 ssl:default [Temporary failure in name resolution]

Here is the relevant part of the log:

[2025-11-20 19:20:45] [CRITICAL] red.main: The main bot task didn't handle an exception and has crashed
Traceback (most recent call last):
  File "/data/venv/lib/python3.11/site-packages/aiohttp/connector.py", line 1203, in _create_direct_connection
    hosts = await self._resolve_host(host, port, traces=traces)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/venv/lib/python3.11/site-packages/aiohttp/connector.py", line 880, in _resolve_host
    return await asyncio.shield(resolved_host_task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/venv/lib/python3.11/site-packages/aiohttp/connector.py", line 917, in _resolve_host_with_throttle
    addrs = await self._resolver.resolve(host, port, family=self._family)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/venv/lib/python3.11/site-packages/aiohttp/resolver.py", line 33, in resolve
    infos = await self._loop.getaddrinfo(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "uvloop/loop.pyx", line 1529, in getaddrinfo
socket.gaierror: [Errno -3] Temporary failure in name resolution

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/data/venv/lib/python3.11/site-packages/redbot/__main__.py", line 470, in red_exception_handler
    red_task.result()
  File "/data/venv/lib/python3.11/site-packages/redbot/__main__.py", line 369, in run_bot
    await red.start(token)
  File "/data/venv/lib/python3.11/site-packages/redbot/core/bot.py", line 1298, in start
    await self.login(token)
  File "/data/venv/lib/python3.11/site-packages/discord/client.py", line 675, in login
    data = await self.http.static_login(token)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/venv/lib/python3.11/site-packages/discord/http.py", line 839, in static_login
    data = await self.request(Route('GET', '/users/@me'))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/venv/lib/python3.11/site-packages/discord/http.py", line 654, in request
    async with self.__session.request(method, url, **kwargs) as response:
  File "/data/venv/lib/python3.11/site-packages/aiohttp/client.py", line 1197, in __aenter__
    self._resp = await self._coro
                 ^^^^^^^^^^^^^^^^
  File "/data/venv/lib/python3.11/site-packages/aiohttp/client.py", line 581, in _request
    conn = await self._connector.connect(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/venv/lib/python3.11/site-packages/aiohttp/connector.py", line 544, in connect
    proto = await self._create_connection(req, traces, timeout)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/venv/lib/python3.11/site-packages/aiohttp/connector.py", line 944, in _create_connection
    _, proto = await self._create_direct_connection(req, traces, timeout)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/venv/lib/python3.11/site-packages/aiohttp/connector.py", line 1209, in _create_direct_connection
    raise ClientConnectorError(req.connection_key, exc) from exc
aiohttp.client_exceptions.ClientConnectorError: Cannot connect to host discord.com:443 ssl:default [Temporary failure in name resolution]
[2025-11-20 19:20:45] [WARNING] red.main: Attempting to die as gracefully as possible...
[2025-11-20 19:20:45] [INFO] red.main: Shutting down from unhandled exception

The container gets stuck here. I had to restart it manually.

Run command

services:
  red-discordbot:
    image: phasecorex/red-discordbot:extra-audio
    container_name: red-discordbot
    restart: unless-stopped

    env_file:
    - redacted

    environment:
      PUID: 1000
      PGID: 1000
      PREFIX: .
      EXTRA_ARGS: --dev --debug

    volumes:
    - redacted:/data


Environment info: Arch Linux

Additional context

My internet access went down. The red-discordbot container crashed with that uncaught exception, but no container restart was triggered. When the connection was restored, the bot was still stuck at the uncaught exception, and I had to restart it manually.

lukeomatik, Nov 20 '25 21:11
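
For anyone trying to confirm whether name resolution has recovered inside the container, here is a minimal sketch of a DNS check. It is not part of the image or of Red-DiscordBot; the function name `can_resolve` and its `resolver` parameter are made up for illustration. It only catches `socket.gaierror`, the exception at the bottom of the first traceback above.

```python
import socket
from typing import Callable


def can_resolve(host: str = "discord.com", port: int = 443,
                resolver: Callable = socket.getaddrinfo) -> bool:
    """Return True if the host name resolves, False on a DNS failure.

    `resolver` is injectable (hypothetical parameter) so the check can be
    exercised without real network access.
    """
    try:
        resolver(host, port)
        return True
    except socket.gaierror:
        # Raised for errors such as "Temporary failure in name resolution".
        return False
```

Calling `can_resolve()` with no arguments performs a real lookup of discord.com. A False result while the bot is hung would suggest the network is still down rather than the bot being stuck on its own.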

If it's not restarting after a crash, then the Redbot process isn't completely stopping.

When I get time (maybe Saturday) I'll add a log message that will show when the Redbot process stops and then whether it will quit normally or internally restart (the [p]restart command handler). If you see the quit log, then it'll quit completely and then docker will restart it if configured that way.

PhasecoreX, Nov 21 '25 00:11

Forgot to respond to this earlier, but I did add the log messages. When you restart using the [p]restart command, you will see "Red-DiscordBot has requested a restart" in the logs. Otherwise, you will see "Red-DiscordBot has stopped with exit code 0". It will show 0 if all was good (normal shutdown), and some other number if not.

So in your case, if it crashes, you should see "Red-DiscordBot has stopped with exit code 1" (or some other non-zero number). At that point, if you have it configured as restart: on-failure or restart: unless-stopped, it will restart the entire container. If you don't see that log message, that means the Redbot process hasn't actually stopped, and I can't really do anything about that.

PhasecoreX, Dec 13 '25 17:12
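
To illustrate the point about exit codes, here is a rough sketch of the idea. It is not the image's actual entrypoint: `run_and_report` and `should_restart` are hypothetical names, and `should_restart` is a simplified model of Docker's restart policies that ignores retry limits. The key assumption is that a restart policy can only act once the main process has really exited.

```python
import subprocess
import sys
from typing import Sequence


def run_and_report(cmd: Sequence[str]) -> int:
    """Run cmd, block until it exits, and report its exit code."""
    completed = subprocess.run(cmd)
    code = completed.returncode
    # If the child hangs instead of exiting, this line is never reached,
    # so no exit code is reported and nothing can be restarted.
    print(f"child process has stopped with exit code {code}")
    return code


def should_restart(policy: str, exit_code: int) -> bool:
    """Simplified model of a restart policy, evaluated after the process exits."""
    if policy == "on-failure":
        return exit_code != 0
    if policy in ("always", "unless-stopped"):
        # Restarts regardless of exit code (unless stopped manually).
        return True
    return False  # policy "no"


if __name__ == "__main__":
    # Simulate a crash: a child interpreter that exits with code 1.
    code = run_and_report([sys.executable, "-c", "import sys; sys.exit(1)"])
    print("restart:", should_restart("on-failure", code))
```

In this model, a process that logs "Shutting down" but never exits never reaches the `should_restart` step at all, which matches the behavior described in this issue.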