Resolver uses DNS Servers from inactive interfaces on Windows, causing timeouts
- dnspython Version: 2.7.0
- Platform: Microsoft Windows 11 24H2
- Python Version: 3.11.2
Summary
When using the default resolver (dns.resolver.resolve() or dns.resolver.Resolver()) on Windows, dnspython appears to gather DNS server configurations from the registry for all network interfaces, including those that are currently inactive (e.g., disconnected USB Ethernet adapters, disabled adapters). It then attempts to use these DNS servers during resolution. If the DNS servers associated with the inactive interface are unreachable (which they typically are when the interface is down), dnspython experiences significant delays due to query timeouts and retries before eventually falling back to servers configured on active interfaces. This results in unexpectedly long DNS resolution times within Python applications.
Observed Behavior
- A Python script using dns.resolver.resolve() experiences significant delays (e.g., ~4 seconds) for DNS lookups, leading to a 16-second startup delay for some applications.
- cProfile analysis shows the time is spent within dns.resolver.resolve, ultimately waiting in select.select, with individual internal query attempts timing out after ~1.3 seconds each.
- Printing dns.resolver.get_default_resolver().nameservers reveals a list containing DNS server IP addresses that are configured on an inactive network interface (e.g., associated with a disconnected USB Ethernet adapter/docking station). These servers were confirmed to be present in the registry key for that interface (HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces\{GUID}).
- Using system tools like nslookup to explicitly query these specific DNS server IPs results in timeouts, confirming they are unreachable when the associated interface is inactive.
- When the specific network interface associated with the problematic DNS servers is activated (e.g., connecting the USB adapter), the delay in the dnspython script disappears, and resolution becomes fast.
- System tools like nslookup using default system settings may resolve quickly even when the interface is inactive, suggesting they might handle fallback from inactive interface DNS servers differently or faster than dnspython.
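The nslookup check above can also be approximated from Python without dnspython. The following is a minimal sketch, not part of the original report: build_query and probe are hypothetical helper names, the probe is IPv4/UDP only, and it treats any socket error or timeout as "no reply".

```python
import socket
import struct
import time


def build_query(hostname: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for the A record of hostname."""
    # Header: ID, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: length-prefixed labels terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN).
    return header + qname + struct.pack("!HH", 1, 1)


def probe(server: str, port: int = 53, timeout: float = 1.0,
          hostname: str = "example.com"):
    """Return the reply time in seconds, or None if no reply arrived."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        try:
            sock.sendto(build_query(hostname), (server, port))
            sock.recvfrom(512)
        except OSError:  # covers timeouts and unreachable-network errors
            return None
        return time.monotonic() - start
```

Calling probe() on each entry of dns.resolver.get_default_resolver().nameservers should return None for the servers tied to the inactive interface and a small float for the working ones.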
Expected Behavior
dnspython's default resolver configuration on Windows should ideally:
- Prioritize DNS servers associated with currently active (OperationalStatus="Up") network interfaces.
- Avoid attempting to query DNS servers associated with inactive network interfaces, or fail very quickly on them without long timeouts/retries.
- The list returned by dns.resolver.get_default_resolver().nameservers should primarily reflect servers usable by the system at that moment.
Diagnosis / Root Cause Analysis
- dnspython appears to correctly read DNS server configuration from the Windows registry, likely iterating through keys under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces\{GUID}.
- However, it does not seem to correlate this configuration data with the current operational status of the corresponding network interface {GUID} before adding the NameServer or DhcpNameServer entries to its internal list of servers to query.
- During resolution, dnspython sequentially attempts to query servers from this combined list. When it attempts to query the unreachable servers belonging to the inactive interface, it waits for the standard DNS query timeout, possibly retries, and only then moves to the next server in the list.
- This timeout/retry cycle for each unreachable server configured on inactive interfaces causes the cumulative observed delay.
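The cumulative effect described above can be illustrated with a deliberately simplified model. This is only a sketch: first_pass_delay is a hypothetical helper, it assumes a fixed per-try timeout and a single pass over the list, and it does not reproduce dnspython's actual retry or backoff logic.

```python
def first_pass_delay(servers, reachable, per_try_timeout):
    """Seconds spent waiting before the first reachable server is tried.

    servers: nameserver addresses in the order they would be queried.
    reachable: set of addresses that would answer promptly.
    per_try_timeout: assumed wait, in seconds, on each silent server.
    """
    delay = 0.0
    for server in servers:
        if server in reachable:
            return delay
        delay += per_try_timeout
    return delay  # no server answered during this pass
```

Under this model, every unreachable server listed ahead of a working one adds a full timeout to each lookup, which is why the order of the discovered list matters as much as its contents.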
Steps to Reproduce (Conceptual)
- On a Windows machine, configure a secondary network adapter (e.g., USB Ethernet, second physical NIC, possibly a persistent virtual adapter) with specific DNS server IPs (e.g., 10.50.10.50, 10.50.50.50 as in the user's case, or any IP unreachable when the adapter is inactive). Note the adapter's GUID from the registry key HKLM\SYSTEM\CCS\Services\Tcpip\Parameters\Interfaces\{GUID} where these IPs are set.
- Ensure this specific network adapter is inactive (e.g., disconnected, disabled in ncpa.cpl).
- Ensure at least one other primary network adapter (e.g., Wi-Fi, main Ethernet) is active and configured with working DNS servers.
- Run a simple Python script:

    import dns.resolver
    import time
    import platform

    if platform.system() != 'Windows':
        print("This test is designed for Windows.")
    else:
        print(f"dnspython default servers: {dns.resolver.get_default_resolver().nameservers}")  # Should show servers from both active and inactive interfaces
        print("Starting lookup...")
        start_time = time.monotonic()
        try:
            # Use a common hostname unlikely to be in local cache initially
            result = dns.resolver.resolve('google.com', 'A')
            # Or use the SRV/TXT records from the original scenario if applicable
            # result_srv = dns.resolver.resolve('_mongodb._tcp.cluster0.acvlhai.mongodb.net', 'SRV')
            # result_txt = dns.resolver.resolve('cluster0.acvlhai.mongodb.net', 'TXT')
            duration = time.monotonic() - start_time
            print(f"Lookup successful in {duration:.2f} seconds.")
            # print(result.rrset)
        except Exception as e:
            duration = time.monotonic() - start_time
            print(f"Lookup failed after {duration:.2f} seconds: {e}")

- Observe: The script execution time for the resolve() call will be significantly longer (multiple seconds) than expected for a normal DNS lookup, reflecting the timeouts caused by querying servers on the inactive interface.
- Activate the network adapter configured in Step 1 (e.g., plug in the USB adapter).
- Re-run the Python script.
- Observe: The script execution time for resolve() should now be very short (sub-second).
Suggested Fix / Improvement
Consider modifying the Windows-specific DNS configuration logic within dnspython (potentially in dns.platform.windows or called by dns.resolver.Resolver._config_resolver) to check the operational status of the network interface associated with a set of DNS servers before adding them to the usable list. This could involve using Windows APIs like GetAdaptersAddresses (checking OperStatus) or WMI queries (Win32_NetworkAdapter NetConnectionStatus) or potentially parsing PowerShell Get-NetAdapter output (though native APIs are preferable). Servers associated with interfaces not currently Up should probably be excluded or placed at the very end of the list with minimal retry/timeout settings.
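As a rough illustration of the selection policy suggested here, the sketch below orders servers by interface status. It is not dnspython code: InterfaceDns and usable_nameservers are hypothetical names, and the is_up flag is assumed to come from whichever status source ends up being used (GetAdaptersAddresses, WMI, or something else).

```python
from dataclasses import dataclass, field


@dataclass
class InterfaceDns:
    """Hypothetical record pairing an interface's status with its DNS servers."""
    name: str
    is_up: bool
    nameservers: list = field(default_factory=list)


def usable_nameservers(interfaces, keep_inactive=False):
    """Return servers from Up interfaces first, without duplicates.

    Servers seen only on inactive interfaces are dropped by default, or
    appended at the very end when keep_inactive is True.
    """
    ordered = []
    # First pass: servers configured on interfaces that are currently Up.
    for iface in interfaces:
        if iface.is_up:
            for server in iface.nameservers:
                if server not in ordered:
                    ordered.append(server)
    if keep_inactive:
        # Second pass: remaining servers, placed after all active ones.
        for iface in interfaces:
            if not iface.is_up:
                for server in iface.nameservers:
                    if server not in ordered:
                        ordered.append(server)
    return ordered
```

Dropping inactive servers entirely is the simpler policy; keeping them at the end only helps if they are paired with short timeouts, as the paragraph above notes.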
Workaround
A functional workaround is to explicitly configure the dnspython resolver to use specific known-good DNS servers, bypassing the problematic system configuration discovery:
import dns.resolver

resolver = dns.resolver.Resolver(configure=False)  # skip system configuration discovery
resolver.nameservers = ['192.168.107.1'] # Example: Using only a known-good local server
# OR
# resolver.nameservers = ['8.8.8.8', '1.1.1.1'] # Example: Using public DNS
# Use resolver.resolve(...) instead of dns.resolver.resolve(...)
answer = resolver.resolve('google.com', 'A')
Another workaround involves monkeypatching Resolver._config_resolver to filter the discovered nameservers based on interface status checked via PowerShell, but this is fragile and not recommended for production.
If you install the WMI module, e.g. pip install 'dnspython[wmi]' then dnspython will use WMI and possibly have better results.
Or pip install wmi if you already installed dnspython.
Hey Bob. Thanks for that advice. I've confirmed that supplying the wmi extra does in fact work around the issue. Unfortunately, it really doesn't get to the root of the problem, which is that out-of-the-box, the default behavior is unacceptable, and because it's a low-level function (resolving DNS), it's not something that consumers higher in the dependency tree are really equipped to address. For example, my issue was in pip-run, which depends on coherent.deps, which depends on jaraco.mongodb, which depends on pymongo, which uses dnspython. Because they don't declare dnspython[wmi], coherent.deps (and its dependencies like pip-run) get the undesirable behavior.
Have you considered that the wmi behavior should be installed by default (on Windows, obviously)? Otherwise, it becomes the responsibility of every consumer downstream to be aware of the issue and resolve it in their environment.
I guess I could add the dependency to jaraco.mongodb or maybe submit a bug request to pymongo, but that will only fix the issue for that scope of the problem. It would be much better if the out-of-the-box behavior was generally usable.
We are considering making WMI the default, but have been waiting for feedback as it's a relatively new thing and the dnspython authors are not windows experts. So far it hasn't been causing problems, and in some cases fixes things.
We (pymongo) are seeing this issue as well, and would be interested in contributing a fix to dnspython itself.
@jaraco @rthalley would you be okay with us submitting a patch that includes Jason's workaround here, and testing it in our CI?
I see from the pymongo ticket that wmi seems to have been abandoned, which is unfortunate. Is there no windows way of getting the active interfaces from python or python-calling-C code, as running powershell and processing its output is something I'd REALLY rather not do.
Through some digging, I found the GitHub repo: https://github.com/tjguk/wmi. I see a few options:
- Vendor wmi.py in dnspython with attribution (it is MIT licensed).
- Try and strip down just the parts of wmi.py that are used by dnspython and keep that as a vendored file.
- Add an option to the resolver to use the powershell lookup for users that are willing to use it.
- Add an option to the resolver to bring your own lookup mechanism.
Another consideration about using WMI - I hadn't realized at the time that the wmi module is based on pywin32, so requires a binary install (non-pure Python), which is non-viable in some environments (vendored packages, pre-release Python versions, etc).
> running powershell and processing its output is something I'd REALLY rather not do.
I get that. It's a little messy, but it might be the most robust means to access interface details, and often a Powershell command can be made to emit reliable output (e.g. JSON).
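To make the JSON idea concrete, here is a rough sketch of what that could look like. It is an illustration only: parse_active_guids and active_interface_guids are hypothetical names, and the Name, InterfaceGuid, and Status property names are assumed to be present on the objects Get-NetAdapter returns on the target system.

```python
import json
import subprocess

# Ask PowerShell for adapter details as JSON rather than formatted text.
COMMAND = (
    "Get-NetAdapter | Select-Object Name, InterfaceGuid, Status "
    "| ConvertTo-Json"
)


def parse_active_guids(json_text):
    """Return the upper-cased GUIDs of adapters whose Status is 'Up'."""
    data = json.loads(json_text) if json_text.strip() else []
    if isinstance(data, dict):
        # ConvertTo-Json emits a bare object when there is only one adapter.
        data = [data]
    return {
        adapter["InterfaceGuid"].upper()
        for adapter in data
        if adapter.get("Status") == "Up"
    }


def active_interface_guids():
    """Run PowerShell (Windows only) and return GUIDs of active adapters."""
    result = subprocess.run(
        ["powershell", "-NoProfile", "-Command", COMMAND],
        capture_output=True, text=True, check=True,
    )
    return parse_active_guids(result.stdout)
```

The resulting GUIDs could then be matched against the {GUID} registry keys under Tcpip\Parameters\Interfaces to skip servers belonging to adapters that are not Up.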
Another option might be to call the wmic command and parse its output.
That said, I do believe a direct API would be better. I've contributed several modules that utilize ctypes to call into Windows APIs (e.g. keyring, pywin32-ctypes, jaraco.windows, ...). Having a direct call into Windows APIs would be best as it would be low overhead (and could potentially be queried dynamically to respond to changes in the environment).
In the past, I've used netifaces with success, but it's also abandoned and non-pure.
I asked Gemini to generate something for me, and it came up with a plausible implementation, but I was unsuccessful in getting it to work. It probably just needs someone to read through the API spec and massage the implementation to pull out the right values, but it'll require some real intelligence.
I don't really want to vendor WMI as then I have to fix that too, and I am not a windows programmer nor do I want to learn it. Something using ctypes would be ok if it were simple enough.
My issue with powershell is not so much the messiness as a desire to not be shelling out to anything, as that too will cause issues with some people's security environments.
I got the following code doing some similar vibe coding with our internal GPT, and it gives the same answer as when using wmi:
Adapter:
DNS Server: 10.122.0.2
With wmi:
>>> dns.resolver.get_default_resolver().nameservers
['10.122.0.2']
Without wmi:
>>> dns.resolver.get_default_resolver().nameservers
['10.122.0.2', '10.0.0.2']
I'm happy to do more testing and create a PR.
I did some more testing and manual validation of the ctypes used, and updated the script. The output is now:
Adapter: Ethernet 4, Status: Operational
DNS Server: 10.122.0.2
Adapter: Loopback Pseudo-Interface 1, Status: Operational
Skipping loopback adapter
I think it is ready to replace the usage of winreg. @rthalley, would you accept a PR to do that?
I would like this code, but I don't want to remove the other solutions just yet. Can you add your code to win32util.py, making a "getter" like the ones for wmi and the registry?
Will do!