Firezone DNS fails to be controlled and/or reverted

Received two reports from a customer about DNS issues on Windows related to Firezone.

https://firezonehq.slack.com/archives/C08KPQKJZKM/p1749053656472029
https://firezonehq.slack.com/archives/C08KPQKJZKM/p1747746982240599

Version 1.4.14

This is a tracking issue to investigate and improve the robustness of our DNS control logic, to prevent these issues going forward.

I suspect we may find clues in Sentry.

Jun 04 '25 17:06 jamilbk

This is reproducible for me on Windows 10:

1. Connect Firezone.
2. Switch the DNS server setting of the primary network adapter to manual and set it to 1.1.1.1.
3. Observe that all network connectivity is cut.

Jun 05 '25 04:06 jamilbk

Logs from the above session

firezone_logs_2025_06_05-04-52.zip

Jun 05 '25 04:06 jamilbk

Edition	Windows 10 Pro
Version	22H2
Installed on	1/30/2025
OS build	19045.5917
Experience	Windows Feature Experience Pack 1000.19061.1000.0

Jun 05 '25 05:06 jamilbk

Can't seem to reproduce the above on Windows 11 Pro.

Jun 05 '25 05:06 jamilbk

> Logs from the above session
>
> firezone_logs_2025_06_05-04-52.zip

Can you reproduce with debug logs please?

Jun 05 '25 06:06 thomaseizinger

firezone_logs_2025_06_05-07-12.zip

Jun 05 '25 07:06 jamilbk

In the above logs, the 1.1.1.1 address makes sense, because I set the addresses to Primary: 1.1.1.1 and Secondary: 1.0.0.1

However, the 168.63.129.16 address doesn't make any sense. Not sure where that's coming from.

Strangely, I'm also able to reproduce the connectivity hang without the Firezone GUI running. Maybe the tunnel service is at play, or maybe this is related to the VM and RDP.

It's very possible the issues users are facing here are due to #8439. Maybe Windows reports the DNS servers of other interfaces to us, but with a higher metric, expecting us not to use them, and we use them anyway.

All other apps on Windows use the "Primary" resolver unless it fails. I believe we round-robin among all the ones we "find", which could explain the behavior here.
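To make the hypothesis concrete, here is a minimal Python sketch, not Firezone's actual code. The `Adapter` type, the adapter names, and the metric values are made up for illustration. It contrasts using every resolver we "find" with using only the resolvers of the lowest-metric adapter, which is roughly what preferring the "Primary" resolver would mean.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Adapter:
    """Hypothetical view of a network adapter; not a real Firezone or Windows type."""
    name: str
    metric: int                   # lower metric = preferred interface
    dns_servers: Tuple[str, ...]


def resolvers_by_metric(adapters: List[Adapter]) -> List[str]:
    """All DNS servers, ordered by interface metric and de-duplicated."""
    ordered: List[str] = []
    for adapter in sorted(adapters, key=lambda a: a.metric):
        for server in adapter.dns_servers:
            if server not in ordered:
                ordered.append(server)
    return ordered


def primary_only(adapters: List[Adapter]) -> List[str]:
    """Only the DNS servers of the lowest-metric adapter that has any."""
    for adapter in sorted(adapters, key=lambda a: a.metric):
        if adapter.dns_servers:
            return list(adapter.dns_servers)
    return []


# Made-up example data: the metrics and adapter names are assumptions.
adapters = [
    Adapter("Ethernet", 25, ("1.1.1.1", "1.0.0.1")),
    Adapter("Other", 5000, ("168.63.129.16",)),
]

print(resolvers_by_metric(adapters))
print(primary_only(adapters))
```

If we currently behave like `resolvers_by_metric` without honoring the ordering, a resolver from a high-metric interface such as 168.63.129.16 could end up being used even though other Windows apps would never query it.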

Jun 05 '25 07:06 jamilbk

> All other apps on Windows use the "Primary" resolver unless it fails. I believe we round-robin among all the ones we "find", which could explain the behavior here.

We don't round-robin anything. We map 1 to 1, and the OS picks which one to send queries to.
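As a rough illustration of what a 1-to-1 mapping could look like, here is a Python sketch. It is one possible reading, not the real implementation: the function name `map_to_sentinels` is hypothetical, and the 192.0.2.0/24 range is a documentation placeholder rather than the range Firezone actually uses.

```python
import ipaddress
from typing import Dict, List


def map_to_sentinels(upstreams: List[str],
                     sentinel_network: str = "192.0.2.0/24") -> Dict[str, str]:
    """Assign each distinct upstream resolver exactly one placeholder address.

    The returned dict maps placeholder address -> upstream resolver.
    The network used here is a documentation range, chosen only for illustration.
    """
    hosts = iter(ipaddress.ip_network(sentinel_network).hosts())
    mapping: Dict[str, str] = {}
    for upstream in upstreams:
        if upstream in mapping.values():
            continue  # keep the mapping 1-to-1: skip duplicates
        mapping[str(next(hosts))] = upstream
    return mapping


mapping = map_to_sentinels(["1.1.1.1", "1.0.0.1", "1.1.1.1"])
print(mapping)
```

In this reading, the placeholder addresses are what the OS sees as DNS servers, so the choice of which one receives a query is left to the OS and there is no round-robin on our side.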

Jun 07 '25 14:06 thomaseizinger

When we set our DNS servers though, do we respect the metric of the old ones?

Jun 07 '25 15:06 jamilbk