
Could not contact DNS servers

Open quantitative-technologies opened this issue 8 months ago • 9 comments

I'm not sure whether this is a new issue or related to #124.

I only see this issue when I am connected to Mullvad VPN, never without the VPN. For example:

aiohttp.client_exceptions.ClientConnectorDNSError: Cannot connect to host api.huobi.pro:443 ssl:default [Could not contact DNS servers]

The VPN itself does not seem to be at fault. To work around the error, I uninstalled aiodns from the virtual env.

Can you retest with 3.3.0 now that #124 is resolved?

bdraco avatar May 05 '25 08:05 bdraco

Actually I can no longer reproduce the issue: I see that aiodns 3.2.0 is currently installed in the venv I'm using, and I've not seen this again.

Hi - We have what appears to be this issue in HomeAssistant which uses v3.3.0. Could it be related?

RedNo7 avatar Jun 04 '25 09:06 RedNo7

To help us investigate the issue, please provide clear reproduction steps and enough detail for us to reliably trigger the problem.

bdraco avatar Jun 04 '25 09:06 bdraco

It is not easy to reproduce the bug on demand, and the failure is not persistent. From what I can see, the Version and met.no integrations are affected most often. Sensors using the Scrape or RESTful integrations are not affected. https://github.com/home-assistant/core/issues/145708

Paja-git avatar Jun 04 '25 12:06 Paja-git

@bdraco I tried to rule out the DNS server as the cause by changing resolv.conf to point to 8.8.8.8#53. It did not help. What I noticed: the "problematic" names (data-v2.hacs.xyz, registry.hub.docker.com...) have more than one A record and also have IPv6 addresses assigned. Could this be the cause?

Paja-git avatar Jun 06 '25 08:06 Paja-git
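
Editor's note: the observation above (several A records plus IPv6 addresses) can be checked per hostname with a small standard-library sketch. It goes through the system resolver via `socket.getaddrinfo`, not through aiodns, so it only shows which addresses a name has, not why aiodns times out. The function name `addresses_by_family` and the `lookup` parameter are hypothetical, introduced here for illustration.

```python
import socket
from collections import defaultdict


def addresses_by_family(host, port=443, lookup=socket.getaddrinfo):
    """Group the addresses returned for `host` into IPv4 ("A") and IPv6 ("AAAA").

    `lookup` defaults to the system resolver; it is a parameter only so the
    function can be exercised without network access (an assumption of this sketch).
    """
    grouped = defaultdict(set)
    for family, _type, _proto, _canonname, sockaddr in lookup(
        host, port, type=socket.SOCK_STREAM
    ):
        if family == socket.AF_INET:
            grouped["A"].add(sockaddr[0])
        elif family == socket.AF_INET6:
            grouped["AAAA"].add(sockaddr[0])
    # Sort for stable, comparable output.
    return {kind: sorted(addrs) for kind, addrs in grouped.items()}
```

Calling `addresses_by_family("data-v2.hacs.xyz")` on an affected machine and on an unaffected one would show whether both see the same set of records.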

Thanks for the update. Unfortunately, we won’t be able to take further action without a reliable way to reproduce the issue. If you’re able to provide a reproducer, we’ll be happy to investigate.

bdraco avatar Jun 06 '25 09:06 bdraco

Hey @bdraco!

I recently ran into this obscure issue under some weird circumstances.

I was deploying a Docker container based on Alpine to my Raspberry Pi 5 (Debian GNU/Linux 12 (bookworm) aarch64) and started seeing this issue. It does not occur on my local machine (Arch Linux x86_64), nor when I uninstall aiodns.

My current guess is that maybe the architecture difference is uncovering a bug?

I have tried downgrading aiodns all the way down to 3.2.0, which is the first version to work properly.

Full traceback
E 2025-06-08 21:49:27,741 hikari.event_manager: an exception occurred handling an event (StartingEvent)
Traceback (most recent call last):
  File "/app/.venv/lib/python3.13/site-packages/aiohttp/resolver.py", line 117, in resolve
    resp = await self._resolver.getaddrinfo(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<5 lines>...
    )
    ^
aiodns.error.DNSError: (12, 'Timeout while contacting DNS servers')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/.venv/lib/python3.13/site-packages/aiohttp/connector.py", line 1512, in _create_direct_connection
    hosts = await self._resolve_host(host, port, traces=traces)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.13/site-packages/aiohttp/connector.py", line 1128, in _resolve_host
    return await asyncio.shield(resolved_host_task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.13/site-packages/aiohttp/connector.py", line 1159, in _resolve_host_with_throttle
    addrs = await self._resolver.resolve(host, port, family=self._family)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.13/site-packages/aiohttp/resolver.py", line 126, in resolve
    raise OSError(None, msg) from exc
OSError: [Errno None] Timeout while contacting DNS servers

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/.venv/lib/python3.13/site-packages/hikari/impl/rest.py", line 850, in _perform_request
    response = await self._client_session.request(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<9 lines>...
    )
    ^
  File "/app/.venv/lib/python3.13/site-packages/aiohttp/client.py", line 768, in _request
    resp = await handler(req)
           ^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.13/site-packages/aiohttp/client.py", line 723, in _connect_and_send_request
    conn = await self._connector.connect(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        req, traces=traces, timeout=real_timeout
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/app/.venv/lib/python3.13/site-packages/aiohttp/connector.py", line 622, in connect
    proto = await self._create_connection(req, traces, timeout)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.13/site-packages/aiohttp/connector.py", line 1189, in _create_connection
    _, proto = await self._create_direct_connection(req, traces, timeout)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.13/site-packages/aiohttp/connector.py", line 1518, in _create_direct_connection
    raise ClientConnectorDNSError(req.connection_key, exc) from exc
aiohttp.client_exceptions.ClientConnectorDNSError: Cannot connect to host discord.com:443 ssl:default [Timeout while contacting DNS servers]

The above exception was the direct cause of the following exception:

[REMOVED FOR BREVITY]
List of packages
Package          Version
---------------- -----------
aiodns           3.4.0
aiohappyeyeballs 2.6.1
aiohttp          3.11.18
aiosignal        1.3.2
async-timeout    5.0.1
asyncpg          0.30.0
attrs            25.3.0
brotli           1.1.0
cffi             1.17.1
ciso8601         2.3.2
colorlog         6.9.0
confspec         0.0.3
croniter         6.0.0
dateparser       1.2.1
frozenlist       1.6.2
hikari           2.3.3
hikari-lightbulb 3.0.0
idna             3.10
linkd            0.0.7
msgspec          0.19.0
multidict        6.4.4
orjson           3.10.18
propcache        0.3.1
pycares          4.8.0
pycparser        2.22
python-dateutil  2.9.0.post0
pytz             2025.2
regex            2024.11.6
ruamel-yaml      0.18.13
ruamel-yaml-clib 0.2.12
ruff             0.11.13
six              1.17.0
tzlocal          5.3.1
yarl             1.20.0

I can reproduce this very consistently (99/100 times). For some reason, the first request after I restart the Docker service sometimes works, but never any request after that. So I would be happy to help track this down if I can lend a hand :)

davfsa avatar Jun 08 '25 22:06 davfsa

The only time I’ve been able to reproduce this is when something creates hundreds of DNSResolver objects and runs the system out of resources.

If someone has a system in a state where they can reproduce the issue, use objgraph or similar to check how many DNSResolver objects are in memory. If you find hundreds of them it’s likely you have a leak in your code somewhere causing resource exhaustion.

bdraco avatar Jun 12 '25 12:06 bdraco
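
Editor's note: for anyone without objgraph installed, the check suggested above can be approximated with the standard-library `gc` module. This is a minimal sketch, not the maintainer's method: the helper name `count_live_instances` is hypothetical, it matches on the class name only, and it only sees objects tracked by the garbage collector.

```python
import gc


def count_live_instances(class_name: str) -> int:
    """Count GC-tracked objects whose class is named `class_name`.

    Matching is by bare class name, so unrelated classes that share the
    name are counted too (a limitation of this sketch).
    """
    return sum(1 for obj in gc.get_objects() if type(obj).__name__ == class_name)


if __name__ == "__main__":
    # Run this inside the affected process (for example from a debug hook)
    # once the DNS errors have started to appear.
    print("DNSResolver objects alive:", count_live_instances("DNSResolver"))
```

A count in the hundreds would point to the resource exhaustion described above; a count of one or two would suggest looking elsewhere.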