DNS route timeout when ephemeral routing peers are replaced

Open Cristobal-M opened this issue 4 months ago • 1 comments

I'm testing a self-hosted NetBird deployment on a VPS, with 2 peers deployed as pods in a Kubernetes cluster, both in the same group. A DNS route such as *.dev-env.svc.cluster.local lets me access the cluster's internal services.

If one or both of the pods are replaced, DNS requests time out for at least one minute, or until I manually delete the disconnected peer:

$ dig test.dev-env.svc.cluster.local
;; communications error to 127.0.0.53#53: timed out
;; communications error to 127.0.0.53#53: timed out
;; communications error to 127.0.0.53#53: timed out
....

Another DNS route assigned to another peer keeps working.

To Reproduce

Steps to reproduce the behavior:

  1. Set up a peer on another machine or on the current one (peerA)
  2. Deploy at least two ephemeral peers in a Kubernetes cluster (setup key with unlimited uses, etc.)
  3. Assign them to a new group
  4. Create a DNS route assigned to the previously created group, distributed to all
  5. Test the DNS route on peerA with a browser and the dig command; it works properly
  6. Trigger a rollout/restart of the Kubernetes deployment controlling the pods
  7. Observe the creation of new peers and the disconnection of the old ones
  8. Test the DNS route on peerA again: NX_DOMAIN error in the browser and timeouts in dig
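
The outage in steps 6 to 8 can be timed instead of checked by hand. Below is a minimal Python sketch, not part of NetBird: the names `resolves`, `poll`, and `longest_outage` are made up for illustration, and it assumes the system resolver on peerA is the one that serves the DNS route.

```python
import socket
import time
from typing import Callable, List, Tuple

Sample = Tuple[float, bool]  # (timestamp in seconds, lookup succeeded)


def resolves(name: str) -> bool:
    """Return True if the system resolver returns at least one address."""
    try:
        return bool(socket.getaddrinfo(name, None))
    except socket.gaierror:
        return False


def poll(
    name: str,
    duration_s: float,
    interval_s: float = 1.0,
    check: Callable[[str], bool] = resolves,
) -> List[Sample]:
    """Look up `name` repeatedly for about `duration_s` seconds."""
    samples: List[Sample] = []
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        started = time.monotonic()
        # A failing lookup may block until the resolver gives up,
        # so samples are not guaranteed to be evenly spaced.
        samples.append((started, check(name)))
        time.sleep(interval_s)
    return samples


def longest_outage(samples: List[Sample]) -> float:
    """Longest stretch from a first failed lookup to the next success.

    If the lookups never recover, the stretch ends at the last sample.
    """
    longest = 0.0
    outage_start = None
    last_ts = None
    for ts, ok in samples:
        if ok:
            if outage_start is not None:
                longest = max(longest, ts - outage_start)
                outage_start = None
        elif outage_start is None:
            outage_start = ts
        last_ts = ts
    if outage_start is not None and last_ts is not None:
        longest = max(longest, last_ts - outage_start)
    return longest
```

Starting `longest_outage(poll("test.dev-env.svc.cluster.local", 180))` on peerA just before step 6 would give a rough figure for how long the route stays unresolvable; the 180-second window is an arbitrary choice.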

Are you using NetBird Cloud? No, self hosted

NetBird version

0.46.0

Is any other VPN software installed?

Plain WireGuard, but it was stopped for this test

Debug output

Peers detail:
 k8s-dev-router.netbird.selfhosted:
  NetBird IP: 100.98.59.220
  Public key: gW5aEA8ny7halNuCp3np+BC8Y84dLxlSUk3FYMHoTg0=
  Status: Connecting
  -- detail --
  Connection type: 
  ICE candidate (Local/Remote): -/-
  ICE candidate endpoints (Local/Remote): -/-
  Relay server address: 
  Last connection update: 3 minutes, 36 seconds ago
  Last WireGuard handshake: -
  Transfer status (received/sent) 0 B/0 B
  Quantum resistance: false
  Networks: -
  Latency: 0s

 test-peer.netbird.selfhosted:
  NetBird IP: 100.98.87.4
  Public key: rNhvc4NDCb67QPiMm4OKxrE7BqY7SNid5TmOYmt5Rio=
  Status: Connected
  -- detail --
  Connection type: P2P
  ICE candidate (Local/Remote): srflx/srflx
  ICE candidate endpoints (Local/Remote): 198.51.100.0:51820/198.51.100.1:51820
  Relay server address: rels://nb.anon-rWzSB.domain:443
  Last connection update: 22 minutes, 42 seconds ago
  Last WireGuard handshake: 30 seconds ago
  Transfer status (received/sent) 2.4 MiB/939.5 KiB
  Quantum resistance: false
  Networks: a73662cc2c604018f3de08d55fee511b.gr7.eu-west-1.eks.anon-A6RtO.domain, anon-7dMuL.domain
  Latency: 52.457204ms

 k8s-dev-router-3.netbird.selfhosted:
  NetBird IP: 100.98.158.169
  Public key: juaSB43wMePYBR+7+yWfl3+oAo0ExLyK1CfWImU/2FY=
  Status: Connected
  -- detail --
  Connection type: Relayed
  ICE candidate (Local/Remote): -/-
  ICE candidate endpoints (Local/Remote): -/-
  Relay server address: rels://nb.anon-rWzSB.domain:443
  Last connection update: 6 minutes, 19 seconds ago
  Last WireGuard handshake: 24 seconds ago
  Transfer status (received/sent) 904 B/780 B
  Quantum resistance: false
  Networks: -
  Latency: 0s

 k8s-dev-router-2.netbird.selfhosted:
  NetBird IP: 100.98.189.199
  Public key: 1G0HC0qKEm/4PHcc3mKf3zhOI/Lx6Bs9KPNknI6k+18=
  Status: Connected
  -- detail --
  Connection type: Relayed
  ICE candidate (Local/Remote): -/-
  ICE candidate endpoints (Local/Remote): -/-
  Relay server address: rels://nb.anon-rWzSB.domain:443
  Last connection update: 6 minutes, 21 seconds ago
  Last WireGuard handshake: 1 minute, 35 seconds ago
  Transfer status (received/sent) 2.5 KiB/2.9 KiB
  Quantum resistance: false
  Networks: *.dev-env.svc.anon-qqen3.domain, hello1.dev-env.svc.anon-qqen3.domain
  Latency: 0s

 k8s-dev-router-1.netbird.selfhosted:
  NetBird IP: 100.98.244.230
  Public key: XkFtel/qGVJj0H5aJNJ/ecMcWIMZ8sd+HITi/6nNryc=
  Status: Connecting
  -- detail --
  Connection type: 
  ICE candidate (Local/Remote): -/-
  ICE candidate endpoints (Local/Remote): -/-
  Relay server address: 
  Last connection update: 3 minutes, 47 seconds ago
  Last WireGuard handshake: -
  Transfer status (received/sent) 0 B/0 B
  Quantum resistance: false
  Networks: -
  Latency: 0s

Events:
  [INFO] SYSTEM (7371653a-bc1d-4508-9397-64b638d7c996)
    Message: Network map updated
    Time: 24 minutes, 41 seconds ago
  [INFO] SYSTEM (0f3d4d55-8823-4ce7-9f50-ccf668c71b8d)
    Message: Network map updated
    Time: 24 minutes, 15 seconds ago
  [INFO] SYSTEM (3ac976a2-3f1a-414e-9be1-0af0cfbc57d0)
    Message: Network map updated
    Time: 23 minutes, 26 seconds ago
  [INFO] SYSTEM (81a9359f-859a-470d-bea9-3e126b6c9025)
    Message: Network map updated
    Time: 23 minutes ago
  [INFO] SYSTEM (d9612b9e-6ef8-4873-bfa3-689c10fb5a54)
    Message: Network map updated
    Time: 22 minutes, 22 seconds ago
  [INFO] SYSTEM (43b9fe62-2c78-44bd-9502-568b3ccdf919)
    Message: Network map updated
    Time: 22 minutes, 21 seconds ago
  [INFO] SYSTEM (36059f9b-ea1b-4cb7-8ede-aae388c58da7)
    Message: Network map updated
    Time: 12 minutes, 22 seconds ago
  [INFO] SYSTEM (d04aa507-1ac2-45b5-b503-07e07c88bf1f)
    Message: Network map updated
    Time: 12 minutes, 20 seconds ago
  [INFO] SYSTEM (84d9dfc8-da1b-4489-baad-42bcbdba70bc)
    Message: Network map updated
    Time: 6 minutes, 21 seconds ago
  [INFO] SYSTEM (d1af41c1-9c82-4f30-981c-3705ce0845b5)
    Message: Network map updated
    Time: 6 minutes, 20 seconds ago
OS: linux/amd64
Daemon version: 0.46.0
CLI version: 0.46.0
Management: Connected to https://nb.anon-rWzSB.domain:443
Signal: Connected to https://nb.anon-rWzSB.domain:443
Relays: 
  [stun:nb.anon-rWzSB.domain:3478] is Available
  [turn:nb.anon-rWzSB.domain:3478?transport=udp] is Available
  [rels://nb.anon-rWzSB.domain:443] is Available
Nameservers: 
FQDN: cristobal.netbird.selfhosted
NetBird IP: 100.98.108.128/16
Interface type: Kernel
Quantum resistance: false
Lazy connection: false
Networks: -
Forwarding rules: 0
Peers count: 3/5 Connected
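
To see at a glance which peers are stale and which ones carry the routes, the "Peers detail" section above can be summarized with a short script. This is only a sketch: it assumes the text layout pasted in this report (presumably from `netbird status -d`), which is not a stable interface, and the function names are hypothetical. The sample is abbreviated from the output above.

```python
from typing import Dict, List

# Abbreviated from the status output in this report (two of the five peers).
SAMPLE = """\
Peers detail:
 k8s-dev-router-2.netbird.selfhosted:
  NetBird IP: 100.98.189.199
  Status: Connected
  -- detail --
  Connection type: Relayed
  Networks: *.dev-env.svc.anon-qqen3.domain, hello1.dev-env.svc.anon-qqen3.domain

 k8s-dev-router-1.netbird.selfhosted:
  NetBird IP: 100.98.244.230
  Status: Connecting
  -- detail --
  Networks: -

Events:
  [INFO] SYSTEM (7371653a-bc1d-4508-9397-64b638d7c996)
    Message: Network map updated
"""


def parse_peers(status_text: str) -> Dict[str, Dict[str, str]]:
    """Map each peer FQDN in the 'Peers detail' section to its fields."""
    peers: Dict[str, Dict[str, str]] = {}
    current = None
    in_peers = False
    for raw in status_text.splitlines():
        line = raw.strip()
        if line == "Peers detail:":
            in_peers = True
            continue
        if not in_peers or not line:
            continue
        if not raw[0].isspace():
            break  # first unindented line (e.g. "Events:") ends the section
        if line.endswith(":") and " " not in line:
            # A bare "name:" line with no spaces starts a new peer.
            current = peers.setdefault(line[:-1], {})
            continue
        key, sep, value = line.partition(":")
        if sep and current is not None:
            current[key.strip()] = value.strip()
    return peers


def not_connected(peers: Dict[str, Dict[str, str]]) -> List[str]:
    """Peers whose status is anything other than 'Connected'."""
    return [n for n, f in peers.items() if f.get("Status") != "Connected"]


def routing_peers(peers: Dict[str, Dict[str, str]]) -> Dict[str, List[str]]:
    """Peers that list at least one network, with those networks."""
    result: Dict[str, List[str]] = {}
    for name, fields in peers.items():
        networks = fields.get("Networks", "-")
        if networks not in ("", "-"):
            result[name] = [n.strip() for n in networks.split(",")]
    return result
```

Run against the full output, `not_connected` would list the two old pods still shown as Connecting, and `routing_peers` would show which of the remaining peers the DNS route is attached to.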

netbird.debug.1109401306.zip

Screenshots

Image1

Have you tried these troubleshooting steps?

  • [ ] Reviewed client troubleshooting (if applicable)
  • [x] Checked for newer NetBird versions
  • [x] Searched for similar issues on GitHub (including closed ones)
  • [ ] Restarted the NetBird client
  • [x] Disabled other VPN software
  • [x] Checked firewall settings

Cristobal-M, Jun 13 '25 17:06