[bug][client] incorrect error for DNS in netbird status debug output

Open fboender opened this issue 7 months ago • 11 comments

Describe the problem

netbird status -d shows an error for nameservers, even though everything is working fine. This is confusing when trying to debug issues with DNS.

client $ netbird status -d
Nameservers: 
  [100.97.168.236:53] for [flusso.nl] is Unavailable, reason: 1 error occurred:
	* read udp 100.97.56.42:48671->100.97.168.236:53: i/o timeout

However, resolving directly against the peer on which the DNS runs works fine.

client $ dig +short @100.97.168.236 vpn-test.flusso.nl
100.97.94.224

The vpn-test.flusso.nl domain is non-public, and only configured in the self-hosted DNS server. So everything is working fine:

client $ curl -I http://vpn-test.flusso.nl 
HTTP/1.0 200 OK

When I restart netbird on the "client", it sometimes shows a different error message:

Nameservers: 
  [100.97.168.236:53] for [flusso.nl] is Unavailable, reason: 1 error occurred:
	* write udp 100.97.56.42:33407->100.97.168.236:53: write: required key not available

But once again, everything is just fine:

client $ curl -I http://vpn-test.flusso.nl 
HTTP/1.0 200 OK

To Reproduce

Not sure how to reproduce, but I've got the following setup (using Netbird Cloud):

I've got two peers. One, which I will call the "server", has a local DNS server configured. It's listening on the internal wt0 interface:

server $ netstat -panu | grep 100.97.168.236:53
udp        0      0 100.97.168.236:53       0.0.0.0:*                           1793604/dnsmasq  

Access policy is default (allow all traffic).

I've got the "server" peer (100.97.168.236) set up in the Netbird web interface under DNS -> Nameservers, with the distribution group set to "All". Under "DNS Settings" I've disabled DNS management for this peer.

The other peer is a "client", nothing special. This is the peer showing the issue.

Expected behavior

I expect the output of netbird status -d to show that everything is working with DNS, since it obviously is. I was diagnosing some DNS problems earlier, and this error really sent me in the wrong direction.

Are you using NetBird Cloud?

Using Netbird Cloud

NetBird version

$ netbird version
0.40.0

Is any other VPN software installed?

Yes, OpenVPN. But for the purpose of testing, I've stopped all openvpn processes.

Debug output

I can provide debug output / logging if required, please let me know.

Additional context

I strongly suspect this is happening because I'm using another peer's internal IP as a DNS server, and Netbird on start-up tries to probe the DNS before the communication between the peers is fully established.

Have you tried these troubleshooting steps?

  • [x] Checked for newer NetBird versions
  • [x] Searched for similar issues on GitHub (including closed ones)
  • [x] Restarted the NetBird client
  • [x] Disabled other VPN software
  • [x] Checked firewall settings

fboender avatar Apr 09 '25 13:04 fboender

Here's a portion of the Netbird client.log right after restarting netbird:

2025-04-09T15:00:00+02:00 INFO client/cmd/service_controller.go:24: starting Netbird service
2025-04-09T15:00:00+02:00 INFO client/cmd/service_controller.go:68: started daemon server: /var/run/netbird.sock
2025-04-09T15:00:00+02:00 INFO client/internal/connect.go:122: starting NetBird client version 0.40.0 on linux/amd64
2025-04-09T15:00:00+02:00 INFO util/net/env_linux.go:70: system supports advanced routing
2025-04-09T15:00:01+02:00 INFO client/internal/connect.go:255: connecting to the Relay service(s): rels://relay.netbird.io:443
2025-04-09T15:00:01+02:00 INFO relay/client/picker.go:72: try to connecting to relay server: rels://relay.netbird.io:443
2025-04-09T15:00:01+02:00 INFO [relay: rels://relay.netbird.io:443] relay/client/client.go:164: create new relay connection: local peerID: xhl1JnmEdd7Rc2oHolwDfW/fMhBzqlaB+P2a39FzV2E=, local peer hashedID: sha-oSRriDWgVR89VYskenHtozADFzdgSjMzQK9RIXnX5OA=
2025-04-09T15:00:01+02:00 INFO [relay: rels://relay.netbird.io:443] relay/client/client.go:170: connecting to relay server
2025-04-09T15:00:01+02:00 INFO [relay: rels://relay.netbird.io:443] relay/client/dialer/race_dialer.go:64: dialing Relay server via quic
2025-04-09T15:00:01+02:00 INFO [relay: rels://relay.netbird.io:443] relay/client/dialer/race_dialer.go:64: dialing Relay server via WS
2025-04-09T15:00:01+02:00 INFO [relay: rels://relay.netbird.io:443] relay/client/dialer/race_dialer.go:89: successfully dialed via: WS
2025-04-09T15:00:01+02:00 INFO [relay: rels://relay.netbird.io:443] relay/client/dialer/race_dialer.go:75: connection attempt aborted via: quic
2025-04-09T15:00:01+02:00 INFO [relay: rels://streamline-de-fra1-2.relay.netbird.io:443] relay/client/client.go:186: relay connection established
2025-04-09T15:00:01+02:00 INFO relay/client/picker.go:90: connected to Relay server: rels://relay.netbird.io:443
2025-04-09T15:00:01+02:00 INFO relay/client/picker.go:64: chosen home Relay server: rels://relay.netbird.io:443
2025-04-09T15:00:01+02:00 INFO client/iface/wgproxy/ebpf/proxy.go:91: local wg proxy listening on: 3128
2025-04-09T15:00:01+02:00 INFO client/iface/wgproxy/factory_kernel.go:29: WireGuard Proxy Factory will produce eBPF proxy
2025-04-09T15:00:01+02:00 INFO client/internal/routemanager/manager.go:193: Routing setup complete
2025-04-09T15:00:01+02:00 INFO client/firewall/create_linux.go:73: creating an nftables firewall manager
2025-04-09T15:00:01+02:00 INFO client/internal/dns/host_unix.go:54: System DNS manager discovered: systemd
2025-04-09T15:00:01+02:00 INFO client/internal/peer/guard/sr_watcher.go:106: reconnected to Signal or Relay server
2025-04-09T15:00:01+02:00 INFO signal/client/grpc.go:149: connected to the Signal Service stream
2025-04-09T15:00:01+02:00 INFO client/internal/engine.go:1672: Network monitor is disabled, not starting
2025-04-09T15:00:01+02:00 INFO client/internal/connect.go:281: Netbird engine started, the IP is: 100.97.56.42/16
2025-04-09T15:00:01+02:00 INFO management/client/grpc.go:156: connected to the Management Service stream
2025-04-09T15:00:02+02:00 INFO relay/client/manager.go:223: update relay server URLs: [rels://relay.netbird.io:443]
2025-04-09T15:00:02+02:00 WARN client/internal/engine.go:785: running SSH server is not permitted
2025-04-09T15:00:02+02:00 INFO client/internal/acl/manager.go:66: ACL rules processed in: 391.115µs, total rules count: 1
2025-04-09T15:00:02+02:00 INFO [peer: XrnRfAQv+ynDggto2DxXDgfqoN1LswWavr+z2Q1lOy0=] client/internal/peer/handshaker.go:79: wait for remote offer confirmation
2025-04-09T15:00:02+02:00 INFO [peer: MkZAbtB2M1R7f4FiTZKo/aVGkJOzUvW+PNndgzhCRHI=] client/internal/peer/handshaker.go:79: wait for remote offer confirmation
2025-04-09T15:00:02+02:00 INFO [peer: I4gHAjSOcKzHKMhvTyMifgYansTbFf7QbR1GZxFsOG0=] client/internal/peer/handshaker.go:79: wait for remote offer confirmation
2025-04-09T15:00:02+02:00 INFO [peer: mh/62If5Qj7spBegMH1q93Ku7HiROo4qG4dbDHaSwz8=] client/internal/peer/handshaker.go:79: wait for remote offer confirmation
2025-04-09T15:00:02+02:00 INFO client/internal/dns/systemd_linux.go:148: adding 1 search domains and 2 match domains. Search list: [netbird.cloud.] , Match list: [flusso.nl. 97.100.in-addr.arpa.]
2025-04-09T15:00:02+02:00 WARN client/internal/dns/upstream.go:233: probing upstream nameserver 100.97.168.236:53: write udp 100.97.56.42:55934->100.97.168.236:53: write: required key not available
2025-04-09T15:00:02+02:00 WARN client/internal/dns/upstream.go:324: Upstream resolving is Disabled for 30s
2025-04-09T15:00:02+02:00 INFO [nameservers: [{100.97.168.236 udp 53}]] client/internal/dns/server.go:726: Temporarily deactivating nameservers group due to timeout
2025-04-09T15:00:02+02:00 INFO client/internal/dns/systemd_linux.go:148: adding 1 search domains and 2 match domains. Search list: [netbird.cloud.] , Match list: [flusso.nl. 97.100.in-addr.arpa.]
2025-04-09T15:00:02+02:00 INFO [peer: I4gHAjSOcKzHKMhvTyMifgYansTbFf7QbR1GZxFsOG0=] client/internal/peer/conn.go:264: OnRemoteAnswer, priority: None, status ICE: Disconnected, status relay: Disconnected
2025-04-09T15:00:02+02:00 INFO [peer: I4gHAjSOcKzHKMhvTyMifgYansTbFf7QbR1GZxFsOG0=] client/internal/peer/handshaker.go:91: received connection confirmation, running version 0.39.2 and with remote WireGuard listen port 51820
2025-04-09T15:00:02+02:00 INFO [peer: I4gHAjSOcKzHKMhvTyMifgYansTbFf7QbR1GZxFsOG0=] client/internal/peer/handshaker.go:79: wait for remote offer confirmation
2025-04-09T15:00:02+02:00 INFO [relay: rels://streamline-de-fra1-2.relay.netbird.io:443] relay/client/client.go:214: open connection to peer: sha-SyFIvz38U9sl7PnA2fEIYew5bzHrJocCkorSvBa534E=
2025-04-09T15:00:02+02:00 INFO client/iface/wgproxy/ebpf/proxy.go:102: turn conn added to wg proxy store: rels://streamline-de-fra1-2.relay.netbird.io:443, endpoint port: :1
2025-04-09T15:00:02+02:00 INFO [peer: I4gHAjSOcKzHKMhvTyMifgYansTbFf7QbR1GZxFsOG0=] client/internal/peer/conn.go:466: created new wgProxy for relay connection: 127.0.0.1:1
2025-04-09T15:00:02+02:00 INFO [peer: I4gHAjSOcKzHKMhvTyMifgYansTbFf7QbR1GZxFsOG0=] client/internal/peer/wg_watcher.go:87: WireGuard watcher started
2025-04-09T15:00:02+02:00 INFO [peer: MkZAbtB2M1R7f4FiTZKo/aVGkJOzUvW+PNndgzhCRHI=] client/internal/peer/conn.go:264: OnRemoteAnswer, priority: None, status ICE: Disconnected, status relay: Disconnected
2025-04-09T15:00:02+02:00 INFO [peer: MkZAbtB2M1R7f4FiTZKo/aVGkJOzUvW+PNndgzhCRHI=] client/internal/peer/handshaker.go:91: received connection confirmation, running version 0.40.0 and with remote WireGuard listen port 51820
2025-04-09T15:00:02+02:00 INFO [peer: MkZAbtB2M1R7f4FiTZKo/aVGkJOzUvW+PNndgzhCRHI=] client/internal/peer/handshaker.go:79: wait for remote offer confirmation
2025-04-09T15:00:02+02:00 INFO [relay: rels://streamline-de-fra1-2.relay.netbird.io:443] relay/client/client.go:214: open connection to peer: sha-uZACWKW7KFKYCY3egZrdF8yOY23mkhVMeOVE2FZsjoU=
2025-04-09T15:00:02+02:00 INFO client/iface/wgproxy/ebpf/proxy.go:102: turn conn added to wg proxy store: rels://streamline-de-fra1-2.relay.netbird.io:443, endpoint port: :2
2025-04-09T15:00:02+02:00 INFO [peer: MkZAbtB2M1R7f4FiTZKo/aVGkJOzUvW+PNndgzhCRHI=] client/internal/peer/conn.go:466: created new wgProxy for relay connection: 127.0.0.1:2
2025-04-09T15:00:02+02:00 INFO [peer: MkZAbtB2M1R7f4FiTZKo/aVGkJOzUvW+PNndgzhCRHI=] client/internal/peer/wg_watcher.go:87: WireGuard watcher started
2025-04-09T15:00:02+02:00 INFO [peer: mh/62If5Qj7spBegMH1q93Ku7HiROo4qG4dbDHaSwz8=] client/internal/peer/conn.go:264: OnRemoteAnswer, priority: None, status ICE: Disconnected, status relay: Disconnected
2025-04-09T15:00:02+02:00 INFO [peer: mh/62If5Qj7spBegMH1q93Ku7HiROo4qG4dbDHaSwz8=] client/internal/peer/handshaker.go:91: received connection confirmation, running version 0.39.2 and with remote WireGuard listen port 51820
2025-04-09T15:00:02+02:00 INFO [peer: mh/62If5Qj7spBegMH1q93Ku7HiROo4qG4dbDHaSwz8=] client/internal/peer/handshaker.go:79: wait for remote offer confirmation
2025-04-09T15:00:02+02:00 INFO [relay: rels://streamline-de-fra1-2.relay.netbird.io:443] relay/client/client.go:214: open connection to peer: sha-sf1FGLNxVttZ0VJppF2lm6oSGJ/UPvaOjXFbKb8A+eM=
2025-04-09T15:00:02+02:00 INFO client/iface/wgproxy/ebpf/proxy.go:102: turn conn added to wg proxy store: rels://streamline-de-fra1-2.relay.netbird.io:443, endpoint port: :3
2025-04-09T15:00:02+02:00 INFO [peer: mh/62If5Qj7spBegMH1q93Ku7HiROo4qG4dbDHaSwz8=] client/internal/peer/conn.go:466: created new wgProxy for relay connection: 127.0.0.1:3
2025-04-09T15:00:02+02:00 INFO [peer: mh/62If5Qj7spBegMH1q93Ku7HiROo4qG4dbDHaSwz8=] client/internal/peer/wg_watcher.go:87: WireGuard watcher started
2025-04-09T15:00:02+02:00 INFO [peer: I4gHAjSOcKzHKMhvTyMifgYansTbFf7QbR1GZxFsOG0=] client/internal/peer/conn.go:500: start to communicate with peer via relay
2025-04-09T15:00:02+02:00 INFO [peer: MkZAbtB2M1R7f4FiTZKo/aVGkJOzUvW+PNndgzhCRHI=] client/internal/peer/conn.go:500: start to communicate with peer via relay
2025-04-09T15:00:02+02:00 INFO [peer: mh/62If5Qj7spBegMH1q93Ku7HiROo4qG4dbDHaSwz8=] client/internal/peer/conn.go:500: start to communicate with peer via relay

fboender avatar Apr 09 '25 13:04 fboender

The Nameservers are usually checked for the first time before the Peer connectivity is established and fail immediately, then the handler checks the connectivity only periodically. It can take anywhere from a few seconds to up to half a minute to be retried and properly register with the operating system.

Could it explain the behaviour you are observing?
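
The timing described above can be sketched as a small simulation. This is a hypothetical model, not NetBird's code: `UpstreamProbe`, `retry_after`, and the injected `probe` callable are names invented for illustration, and the 30-second window is only assumed from the "Upstream resolving is Disabled for 30s" line in the client log.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class UpstreamProbe:
    """Hypothetical model: a nameserver is probed, disabled on failure, retried later."""

    probe: Callable[[], Optional[str]]  # returns None on success, else an error string
    retry_after: float = 30.0           # assumption, taken from the "Disabled for 30s" log line
    available: bool = False
    last_error: Optional[str] = None
    disabled_until: float = 0.0

    def tick(self, now: float) -> None:
        """Run one check at time `now` (seconds); do nothing while disabled."""
        if now < self.disabled_until:
            return
        error = self.probe()
        if error is None:
            self.available = True
            self.last_error = None  # a successful probe clears the stored error
        else:
            self.available = False
            self.last_error = error
            self.disabled_until = now + self.retry_after


# Simulate a peer tunnel that only comes up 2 seconds after start-up.
clock = {"now": 0.0}


def fake_probe() -> Optional[str]:
    if clock["now"] < 2.0:
        return "write: required key not available"
    return None


upstream = UpstreamProbe(probe=fake_probe)
for t in (0.0, 5.0, 29.0, 30.0):
    clock["now"] = t
    upstream.tick(t)
    print(t, upstream.available, upstream.last_error)
```

In this model the first probe fails because the tunnel is not up yet, the checks at 5 s and 29 s are skipped, and the retry at 30 s succeeds and clears the error. If the real client followed the same pattern, the status should recover within roughly half a minute.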

nazarewk avatar Apr 09 '25 13:04 nazarewk

I checked the status every minute for 10 minutes, but it didn't change, even though I could resolve against the DNS server all that time:

$ for I in 1 2 3 4 5 6 7 8 9 10; do date; dig +short vpn-test.flusso.nl; netbird status -d  | grep -A2 "^Nameservers:"; sleep 60; done
Wed Apr  9 04:22:51 PM CEST 2025
100.97.94.224
Nameservers: 
  [100.97.168.236:53] for [flusso.nl] is Unavailable, reason: 1 error occurred:
	* write udp 100.97.56.42:55934->100.97.168.236:53: write: required key not available
Wed Apr  9 04:23:51 PM CEST 2025
100.97.94.224
Nameservers: 
  [100.97.168.236:53] for [flusso.nl] is Unavailable, reason: 1 error occurred:
	* write udp 100.97.56.42:55934->100.97.168.236:53: write: required key not available
---SNIP---
Wed Apr  9 04:30:53 PM CEST 2025
100.97.94.224
Nameservers: 
  [100.97.168.236:53] for [flusso.nl] is Unavailable, reason: 1 error occurred:
	* write udp 100.97.56.42:55934->100.97.168.236:53: write: required key not available
Wed Apr  9 04:31:53 PM CEST 2025
100.97.94.224
Nameservers: 
  [100.97.168.236:53] for [flusso.nl] is Unavailable, reason: 1 error occurred:
	* write udp 100.97.56.42:55934->100.97.168.236:53: write: required key not available

It's about half an hour later now, and still not okay:

$ date; dig +short vpn-test.flusso.nl; netbird status -d  | grep -A2 "^Nameservers:"; 
Wed Apr  9 05:01:04 PM CEST 2025
100.97.94.224
Nameservers: 
  [100.97.168.236:53] for [flusso.nl] is Unavailable, reason: 1 error occurred:
	* write udp 100.97.56.42:55934->100.97.168.236:53: write: required key not available

And just to prove that everything is working properly and there are no connectivity issues between the peers:

$ date; dig +short @100.97.168.236 vpn-test.flusso.nl
Wed Apr  9 05:02:39 PM CEST 2025
100.97.94.224
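
For repeated checks like the loop above, the Nameservers block can be parsed instead of grepped. The following is a minimal sketch that assumes the output keeps the exact layout shown in this thread; `NameserverEntry` and `parse_nameservers` are names made up for this example and are not part of NetBird.

```python
import re
from dataclasses import dataclass, field
from typing import List

# Matches lines such as:
#   [100.97.168.236:53] for [flusso.nl] is Unavailable, reason: 1 error occurred:
#   [1.1.1.1:53] for [flusso.nl] is Available
ENTRY = re.compile(
    r"^\s*\[(?P<servers>[^\]]+)\] for \[(?P<domains>[^\]]*)\] is (?P<state>Available|Unavailable)"
)


@dataclass
class NameserverEntry:
    servers: str
    domains: str
    available: bool
    errors: List[str] = field(default_factory=list)


def parse_nameservers(status_text: str) -> List[NameserverEntry]:
    """Extract nameserver entries from text shaped like `netbird status -d` output."""
    entries: List[NameserverEntry] = []
    in_section = False
    for line in status_text.splitlines():
        if line.startswith("Nameservers:"):
            in_section = True
            continue
        if not in_section:
            continue
        match = ENTRY.match(line)
        stripped = line.strip()
        if match:
            entries.append(
                NameserverEntry(match["servers"], match["domains"], match["state"] == "Available")
            )
        elif stripped.startswith("* ") and entries:
            # Indented "* ..." lines carry the error details of the previous entry.
            entries[-1].errors.append(stripped[2:])
        elif stripped and not line[0].isspace():
            # A non-indented line starts the next top-level section.
            in_section = False
    return entries


# Sample copied from the output above.
SAMPLE = """Nameservers: 
  [100.97.168.236:53] for [flusso.nl] is Unavailable, reason: 1 error occurred:
\t* write udp 100.97.56.42:55934->100.97.168.236:53: write: required key not available
"""

for entry in parse_nameservers(SAMPLE):
    print(entry)
```

The same function could be fed the captured output of `netbird status -d`, which makes it easier to log when (or whether) the reported state ever changes.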

fboender avatar Apr 09 '25 15:04 fboender

btw, I consider this a low priority issue. Everything's working just fine, and as long as it is, you'd probably never even see the debug output.

fboender avatar Apr 09 '25 15:04 fboender

I'm starting to wonder if it might have something to do with using Netbird IP address as a Nameserver.

Looks like a curious case, I will ask the team what the error could be about, because it isn't the usual i/o timeout

nazarewk avatar Apr 09 '25 15:04 nazarewk

The team noted that required key not available type of error usually means the ip address is not handled by any active wireguard peer.

Are you sure you are using the correct IP address, the peer is online, and the connection to it is established? It probably works on the rest of the system because NetBird client will deregister the failing nameserver from the operating system and not use it at all.

Would you be willing to share the following with us? (You can mail it to support at netbird io or send it to me directly, kdn on Slack, if you don't want it available publicly.)

  • a NetBird debug bundle https://docs.netbird.io/how-to/troubleshooting-client#debug-bundle
  • your routing table
  • your system's DNS configuration

nazarewk avatar Apr 09 '25 15:04 nazarewk

Thanks for your quick response!

I forgot to mention that I'm also running a Tinc VPN. For all of the following tests, I've turned it off.

The team noted that required key not available type of error usually means the ip address is not handled by any active wireguard peer.

The error message keeps changing every time I restart netbird between:

  [100.97.168.236:53] for [flusso.nl] is Unavailable, reason: 1 error occurred:
	* write udp 100.97.56.42:49619->100.97.168.236:53: write: required key not available

And

  [100.97.168.236:53] for [flusso.nl] is Unavailable, reason: 1 error occurred:
	* read udp 100.97.56.42:48671->100.97.168.236:53: i/o timeout

I assume because, like you said, when probing the DNS peer at startup, it's not yet properly connected at the wireguard level, so it sets an error message, which never gets cleared. Depending on timing, the error can be i/o timeout or key not available.
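
The two messages point at different stages of the same probe. "required key not available" is reported on the write, which per the team's note above means no active WireGuard peer handled the destination at that moment. "i/o timeout" is reported on the read, so the query went out but no reply arrived before the deadline. A minimal sketch of telling them apart; `classify_probe_error` and its labels are invented for illustration:

```python
def classify_probe_error(message: str) -> str:
    """Map a probe error string from the status output to a rough cause."""
    text = message.lower()
    if "required key not available" in text:
        # The write failed: the address was not handled by an active WireGuard peer.
        return "no-wireguard-peer"
    if "i/o timeout" in text:
        # The read deadline expired: the query was sent, no answer came back in time.
        return "timeout"
    return "other"


# Both samples are copied from the status output quoted in this thread.
samples = [
    "write udp 100.97.56.42:49619->100.97.168.236:53: write: required key not available",
    "read udp 100.97.56.42:48671->100.97.168.236:53: i/o timeout",
]
for sample in samples:
    print(classify_probe_error(sample), "<-", sample)
```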

Are you sure you are using the correct IP address, the peer is online, and the connection to it is established?

Connectivity to the DNS peer is definitely working. Here are the DNS settings from the web interface:

Image

In the netbird status -d output its status is "Connected":

 web-intern.netbird.cloud:
  NetBird IP: 100.97.168.236
  Public key: mh/62If5Qj7spBegMH1q93Ku7HiROo4qG4dbDHaSwz8=
  Status: Connected
  -- detail --
  Connection type: P2P
  ICE candidate (Local/Remote): srflx/host
  ICE candidate endpoints (Local/Remote): 87.212.219.42:51820/95.217.210.101:51820
  Relay server address: rels://streamline-de-fra1-0.relay.netbird.io:443
  Last connection update: 15 seconds ago
  Last WireGuard handshake: 10 seconds ago
  Transfer status (received/sent) 604 B/1.6 KiB
  Quantum resistance: false
  Networks: -
  Latency: 33.822903ms

I can reach that peer just fine:

root @ hank /var/log/netbird $ ping 100.97.168.236
PING 100.97.168.236 (100.97.168.236) 56(84) bytes of data.
64 bytes from 100.97.168.236: icmp_seq=1 ttl=64 time=44.7 ms
64 bytes from 100.97.168.236: icmp_seq=2 ttl=64 time=37.4 ms

I can also resolve DNS queries against it directly:

$ dig  @100.97.168.236 vpn-test.flusso.nl

; <<>> DiG 9.18.28-0ubuntu0.24.04.1-Ubuntu <<>> @100.97.168.236 vpn-test.flusso.nl
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 3184
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;vpn-test.flusso.nl.		IN	A

;; ANSWER SECTION:
vpn-test.flusso.nl.	0	IN	A	100.97.94.224

;; Query time: 36 msec
;; SERVER: 100.97.168.236#53(100.97.168.236) (UDP)
;; WHEN: Wed Apr 09 17:58:43 CEST 2025
;; MSG SIZE  rcvd: 63

The SERVER: 100.97.168.236#53(100.97.168.236) (UDP) portion there shows it's resolving against the peer.

It probably works on the rest of the system because NetBird client will deregister the failing nameserver from the operating system and not use it at all.

The vpn-test.flusso.nl domain name is only registered in dnsmasq running on the peer I'm using as DNS. Anywhere else, the domain doesn't exist. If I stop Netbird, the domain can no longer be resolved:

✅ root @ hank /var/log/netbird $ dig +short vpn-test.flusso.nl
100.97.94.224
✅ root @ hank /var/log/netbird $ systemctl stop netbird.service 
✅ root @ hank /var/log/netbird $ dig +short vpn-test.flusso.nl
✅ root @ hank /var/log/netbird $ systemctl start netbird.service 
✅ root @ hank /var/log/netbird $ dig +short vpn-test.flusso.nl
100.97.94.224

So everything points to my system using the DNS on the peer just fine, despite the error message in debug status output.

  • a NetBird debug bundle https://docs.netbird.io/how-to/troubleshooting-client#debug-bundle
  • your routing table
  • your system's DNS configuration

The Netbird debug bundle will have to wait until tomorrow.

Routing table:

✅ root @ hank ~/Projects/flusso/netbird $ route -n -v
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.1.1     0.0.0.0         UG    600    0        0 wlp2s0
100.97.0.0      0.0.0.0         255.255.0.0     U     0      0        0 wt0
192.168.1.0     0.0.0.0         255.255.255.0   U     600    0        0 wlp2s0

Resolvectl status:

$ resolvectl status | cat
Global
         Protocols: -LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
  resolv.conf mode: stub

Link 3 (enp1s0)
    Current Scopes: none
         Protocols: -DefaultRoute -LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported

Link 4 (wlp2s0)
    Current Scopes: DNS
         Protocols: +DefaultRoute -LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
Current DNS Server: 192.168.1.1
       DNS Servers: 192.168.1.1
        DNS Domain: home

Link 72 (enx806d97190841)
    Current Scopes: none
         Protocols: -DefaultRoute -LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported

Link 75 (wt0)
    Current Scopes: DNS
         Protocols: -DefaultRoute -LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
Current DNS Server: 100.97.56.42
       DNS Servers: 100.97.56.42
        DNS Domain: ~flusso.nl netbird.cloud ~97.100.in-addr.arpa

I would have expected the DNS settings for wt0 to be 100.97.168.236, but instead it's the local peer address. Not sure if that's normal. It's systemd-resolved that's listening there, which I guess is normal:

$ netstat -nlpu | grep ":53 "
udp        0      0 127.0.0.54:53           0.0.0.0:*                           1205/systemd-resolv 
udp        0      0 127.0.0.53:53           0.0.0.0:*                           1205/systemd-resolv 

If you require any other information, please let me know. I'll see if I can install the debug bundle tomorrow.

Thank you for your time!

fboender avatar Apr 09 '25 16:04 fboender

I would have expected the DNS settings for wt0 to be 100.97.168.236, but instead it's the local peer address.

Actually, it is the address of the local NetBird resolver: the client's own IP on Linux, or the highest free IP address in the 100.XXX.0.0/16 network (*.255.254) on other systems.
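
The address rule described here can be checked with Python's `ipaddress` module. This is a minimal sketch: `highest_host` is a helper name invented for this example, and it only computes the highest host address in the range. Whether that address is actually free is not modeled.

```python
import ipaddress


def highest_host(network_cidr: str) -> ipaddress.IPv4Address:
    """Return the highest host address of an IPv4 network (broadcast minus one)."""
    network = ipaddress.IPv4Network(network_cidr)
    return network.broadcast_address - 1


# Values taken from this thread: the wt0 route is 100.97.0.0/16 and the client is 100.97.56.42.
network = ipaddress.IPv4Network("100.97.0.0/16")
client_ip = ipaddress.IPv4Address("100.97.56.42")

print("client inside NetBird network:", client_ip in network)
print("highest host address:", highest_host("100.97.0.0/16"))
```

On the Linux client in this thread, resolvectl lists 100.97.56.42 as the DNS server for wt0, which matches the "client's IP" case.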

nazarewk avatar Apr 09 '25 16:04 nazarewk

I'm starting to wonder if it might have something to do with using Netbird IP address as a Nameserver.

Missed that one, but I agree. I don't want to jump to conclusions, but I agree with your earlier assessment:

The Nameservers are usually checked for the first time before the Peer connectivity is established and fail immediately, then the handler checks the connectivity only periodically. It can take anywhere from a few seconds to up to half a minute to be retried and properly register with the operating system.

It feels a lot like the handler correctly tests / reestablishes connectivity, and all the DNS settings are configured just fine, but the debug status just isn't updated properly. But again, I don't want to jump to conclusions. :-)
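
To make that guess concrete, here is a purely illustrative sketch of how a status view could go stale while resolving works. Nothing here is taken from NetBird's source, and whether the client actually behaves this way is not confirmed in this thread. `ResolverState` and its methods are hypothetical names.

```python
from typing import Optional


class ResolverState:
    """Hypothetical: tracks whether the upstream is in use and what the status view reports."""

    def __init__(self) -> None:
        self.active = False
        self.reported_error: Optional[str] = None

    def on_probe_failed(self, error: str) -> None:
        self.active = False
        self.reported_error = error

    def on_probe_succeeded_stale(self) -> None:
        # Resolving works again, but the stored error is never cleared.
        self.active = True

    def on_probe_succeeded_cleared(self) -> None:
        # Resolving works again and the stored error is reset.
        self.active = True
        self.reported_error = None

    def status_line(self) -> str:
        if self.reported_error is None:
            return "Available"
        return f"Unavailable, reason: {self.reported_error}"


state = ResolverState()
state.on_probe_failed("write: required key not available")
state.on_probe_succeeded_stale()
print(state.active, "|", state.status_line())

state.on_probe_succeeded_cleared()
print(state.active, "|", state.status_line())
```

In the stale variant the upstream is active while the status line still reports the start-up error, which is the combination observed above: dig succeeds while the debug output says Unavailable.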

fboender avatar Apr 09 '25 16:04 fboender

If I change the DNS IP in DNS -> Nameservers in the web interface to 1.1.1.1, the error message on the peer almost immediately disappears:

Nameservers: 
  [1.1.1.1:53] for [flusso.nl] is Available

Switch it back in the interface to the peer address and the error reappears:

Nameservers: 
  [100.97.168.236:53] for [flusso.nl] is Unavailable, reason: 1 error occurred:
	* read udp 100.97.56.42:50421->100.97.168.236:53: i/o timeout

But it's definitely working:

$ dig +short vpn-test.flusso.nl
100.97.94.224

fboender avatar Apr 10 '25 06:04 fboender

The Nameservers are usually checked for the first time before the Peer connectivity is established and fail immediately, then the handler checks the connectivity only periodically. It can take anywhere from a few seconds to up to half a minute to be retried and properly register with the operating system.

Could it explain the behaviour you are observing?

Same problem here. Before, I could wait a minute and the DNS would be back online, but this doesn't work anymore after a recent update. Letting a peer serve internal DNS should be a rather common practice; otherwise I can't imagine why people would need split-horizon DNS. It's a pity that NetBird doesn't give this any special treatment. At the very least it should check DNS server availability much more actively (or just let the user decide how long that period should be).

ZnqbuZ avatar May 31 '25 19:05 ZnqbuZ