restart: on-failure (or unless-stopped) not honored when aiohttp exception raised (Temporary failure in name resolution)
Sorry for closing and reopening this issue; that was my mistake.
Describe the bug
When the bot shuts down from an unhandled exception, the container is not restarted, even though my Docker Compose file specifies restart: unless-stopped.
In my case the exception was: aiohttp.client_exceptions.ClientConnectorError: Cannot connect to host discord.com:443 ssl:default [Temporary failure in name resolution]
Here is the relevant log excerpt:
[2025-11-20 19:20:45] [CRITICAL] red.main: The main bot task didn't handle an exception and has crashed
Traceback (most recent call last):
File "/data/venv/lib/python3.11/site-packages/aiohttp/connector.py", line 1203, in _create_direct_connection
hosts = await self._resolve_host(host, port, traces=traces)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/venv/lib/python3.11/site-packages/aiohttp/connector.py", line 880, in _resolve_host
return await asyncio.shield(resolved_host_task)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/venv/lib/python3.11/site-packages/aiohttp/connector.py", line 917, in _resolve_host_with_throttle
addrs = await self._resolver.resolve(host, port, family=self._family)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/venv/lib/python3.11/site-packages/aiohttp/resolver.py", line 33, in resolve
infos = await self._loop.getaddrinfo(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "uvloop/loop.pyx", line 1529, in getaddrinfo
socket.gaierror: [Errno -3] Temporary failure in name resolution
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/data/venv/lib/python3.11/site-packages/redbot/__main__.py", line 470, in red_exception_handler
red_task.result()
File "/data/venv/lib/python3.11/site-packages/redbot/__main__.py", line 369, in run_bot
await red.start(token)
File "/data/venv/lib/python3.11/site-packages/redbot/core/bot.py", line 1298, in start
await self.login(token)
File "/data/venv/lib/python3.11/site-packages/discord/client.py", line 675, in login
data = await self.http.static_login(token)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/venv/lib/python3.11/site-packages/discord/http.py", line 839, in static_login
data = await self.request(Route('GET', '/users/@me'))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/venv/lib/python3.11/site-packages/discord/http.py", line 654, in request
async with self.__session.request(method, url, **kwargs) as response:
File "/data/venv/lib/python3.11/site-packages/aiohttp/client.py", line 1197, in __aenter__
self._resp = await self._coro
^^^^^^^^^^^^^^^^
File "/data/venv/lib/python3.11/site-packages/aiohttp/client.py", line 581, in _request
conn = await self._connector.connect(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/venv/lib/python3.11/site-packages/aiohttp/connector.py", line 544, in connect
proto = await self._create_connection(req, traces, timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/venv/lib/python3.11/site-packages/aiohttp/connector.py", line 944, in _create_connection
_, proto = await self._create_direct_connection(req, traces, timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/venv/lib/python3.11/site-packages/aiohttp/connector.py", line 1209, in _create_direct_connection
raise ClientConnectorError(req.connection_key, exc) from exc
aiohttp.client_exceptions.ClientConnectorError: Cannot connect to host discord.com:443 ssl:default [Temporary failure in name resolution]
[2025-11-20 19:20:45] [WARNING] red.main: Attempting to die as gracefully as possible...
[2025-11-20 19:20:45] [INFO] red.main: Shutting down from unhandled exception
The container gets stuck here. I had to restart it manually.
Run command
services:
  red-discordbot:
    image: phasecorex/red-discordbot:extra-audio
    container_name: red-discordbot
    restart: unless-stopped
    env_file:
      - redacted
    environment:
      PUID: 1000
      PGID: 1000
      PREFIX: .
      EXTRA_ARGS: --dev --debug
    volumes:
      - redacted:/data
Environment info: Arch Linux
Additional context: My internet connection went down. The red-discordbot container crashed with the uncaught exception above, but no container restart was triggered. When the connection was restored, the bot was still stuck at the uncaught exception, and I had to restart it manually.
If it's not restarting after a crash, then the Redbot process isn't completely stopping.
When I get time (maybe Saturday) I'll add a log message that will show when the Redbot process stops and then whether it will quit normally or internally restart (the [p]restart command handler). If you see the quit log, then it'll quit completely and then docker will restart it if configured that way.
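For illustration only: the point that a missing restart means the process never fully exits can be sketched as a shutdown guarded by a deadline. This is not Red's actual shutdown code. The function name `cleanup_with_deadline`, the `cleanup` callable, and the timeout value are all hypothetical; the sketch only shows the general asyncio pattern of abandoning a cleanup step that hangs so the process can still exit.

```python
import asyncio


async def cleanup_with_deadline(cleanup, timeout):
    """Await cleanup() for at most `timeout` seconds.

    Returns True if cleanup finished in time, False if it was abandoned.
    `cleanup` is a hypothetical zero-argument coroutine function standing in
    for whatever graceful-shutdown work a bot performs.
    """
    try:
        await asyncio.wait_for(cleanup(), timeout=timeout)
        return True
    except asyncio.TimeoutError:
        # wait_for cancels the cleanup coroutine when the deadline passes.
        # A coroutine that swallows cancellation could still block here.
        return False
```

A caller would run this after the crash is logged and then exit with a non-zero status (for example `sys.exit(1)`) whether or not the cleanup completed, so that the container's main process actually ends.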
Forgot to respond to this earlier, but I did add the log messages. When you restart using the [p]restart command, you will see "Red-DiscordBot has requested a restart" in the logs. Otherwise, you will see "Red-DiscordBot has stopped with exit code 0". It will show 0 if all was good (normal shutdown), and some other number if not.
So in your case, if it crashes, you should see "Red-DiscordBot has stopped with exit code 1" (or some other non-zero number). At that point, if you have it configured as restart: on-failure or restart: unless-stopped, it will restart the entire container. If you don't see that log message, that means the Redbot process hasn't actually stopped, and I can't really do anything about that.
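The behaviour described above can be sketched as a small launcher. This is a minimal sketch, not the image's actual entrypoint: `run_and_report` and its `name` parameter are hypothetical, and only the log wording is taken from the comment above. It runs a child process, waits for it to stop, prints the exit code, and returns it so the caller can pass it on. Docker's restart policy only acts once the container's main process has exited, which is why the log line only appears after the child has really stopped.

```python
import subprocess
import sys


def run_and_report(cmd, name="Red-DiscordBot"):
    """Run `cmd` to completion, log its exit code, and return that code.

    Hypothetical helper: if the child process never exits, this call never
    returns and no "has stopped" message is printed.
    """
    completed = subprocess.run(cmd)
    print(f"{name} has stopped with exit code {completed.returncode}", flush=True)
    return completed.returncode
```

A launcher script would typically end with `sys.exit(run_and_report(sys.argv[1:]))`, so that a non-zero exit code from the bot becomes the container's exit code and restart: on-failure or restart: unless-stopped can take effect.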