Can't ping connected peer
Describe the problem
I have 2 NetBird peers:
- elk peer (with Elasticsearch Heartbeat service) - 100.81.167.156 and
- node peer - 100.81.94.114
Periodically (very often lately) I can't ping the second peer from the first. It's weird, because on the first peer netbird reports that the second peer is connected.
After I ran netbird down && netbird up on both(!) nodes, it started working.
It does not work after running netbird down && netbird up only on the elk peer.
Are you using NetBird Cloud?
No, Self-hosted NetBird.
NetBird version
Peer 1 (elk): 0.39.2
Peer 2 (node): 0.39.1
Is any other VPN software installed?
No
Debug output
admin@elk:~/elk/current$ netbird status -d
Peers detail:
...
node-2.netbird.selfhosted:
NetBird IP: 100.81.94.114
Public key: A0/k9FWRkF+JspDjOVIhk0YaaRDvvTZo3C+kEL0feR0=
Status: Connected
-- detail --
Connection type: Relayed
ICE candidate (Local/Remote): -/-
ICE candidate endpoints (Local/Remote): -/-
Relay server address: rels://gateway.example.com:443
Last connection update: 15 seconds ago
Last WireGuard handshake: -
Transfer status (received/sent) 0 B/444 B
Quantum resistance: false
Networks: -
Latency: 0s
...
Events:
[INFO] SYSTEM (d2d1fc72-dc9b-4114-aea5-9079162218c0)
Message: Network map updated
Time: 22 hours, 39 minutes ago
[INFO] SYSTEM (4f3f9e75-5031-414f-ba1b-5dfa8dfbdd91)
Message: Network map updated
Time: 22 hours, 13 minutes ago
[INFO] SYSTEM (1c730277-76e9-4ace-9d9f-34ab29069d67)
Message: Network map updated
Time: 22 hours, 11 minutes ago
[INFO] SYSTEM (1485f61f-552f-4b6b-a6ee-16520b756222)
Message: Network map updated
Time: 22 hours, 10 minutes ago
[INFO] SYSTEM (0cdd272a-4276-43ff-b534-a678db80756b)
Message: Network map updated
Time: 22 hours, 9 minutes ago
[INFO] SYSTEM (4d81d7bf-3b5d-4a7a-a4a9-8b72e5dfd363)
Message: Network map updated
Time: 22 hours, 9 minutes ago
[INFO] SYSTEM (1d4823f6-2569-4f39-ad52-2ceba201c06e)
Message: Network map updated
Time: 22 hours, 9 minutes ago
[INFO] SYSTEM (5b65f7e8-fcf9-40ce-ab84-a27cbbe70d25)
Message: Network map updated
Time: 8 hours, 34 minutes ago
[WARNING] DNS (fa33e8ad-6363-4b8a-b2a5-a009f542bd18)
Message: All upstream servers failed (probe failed)
Time: 15 seconds ago
Metadata: upstreams: 172.31.xxx.xxx:53
[INFO] SYSTEM (e86e671e-79b8-4871-8544-e63c65d777d1)
Message: Network map updated
Time: 15 seconds ago
OS: linux/amd64
Daemon version: 0.39.2
CLI version: 0.39.2
Management: Connected to https://gateway.example.com:443
Signal: Connected to https://gateway.example.com:443
Relays:
[stun:gateway.example.com:3478] is Available
[turn:gateway.example.com:3478?transport=udp] is Available
[rels://gateway.example.com:443] is Available
Nameservers:
[172.31.xxx.xxx:53] for [.] is Available
FQDN: elk.netbird.selfhosted
NetBird IP: 100.81.167.156/16
Interface type: Kernel
Quantum resistance: false
Networks: -
Forwarding rules: 0
Peers count: 15/21 Connected
Have you tried these troubleshooting steps?
- [x] Checked for newer NetBird versions
- [x] Searched for similar issues on GitHub (including closed ones)
- [x] Restarted the NetBird client
- [x] Disabled other VPN software
- [x] Checked firewall settings
Related issues I feel like I've run into this problem again:
@netandreus, it seems there was no WireGuard handshake between the peers. In some cases this can happen on Linux because the agent's relay proxy is using the wrong addressing.
Can you install the wireguard-tools package on both nodes, run the following commands, and share their output?
sudo wg show
netbird status -d
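For anyone triaging the same symptom, here is a minimal Python sketch that scans saved netbird status -d output for peers reported as Connected but with no WireGuard handshake. The field names are taken from the output pasted in this issue. The function names and the sample text are illustrative assumptions, not part of NetBird.

```python
# Sketch only: parses text in the layout pasted in this issue.
# parse_peers and connected_without_handshake are hypothetical helpers.
WANTED = ("Public key", "Status", "Connection type", "Last WireGuard handshake")

def parse_peers(status_text):
    """Return one dict per peer block found in `netbird status -d` text."""
    peers, current, prev = [], None, ""
    for raw in status_text.splitlines():
        line = raw.strip()
        # A peer block starts with "<name>:" followed by a "NetBird IP:" line.
        if line.startswith("NetBird IP:") and prev.endswith(":"):
            current = {"name": prev[:-1]}
            peers.append(current)
        elif current is not None:
            key, sep, value = line.partition(":")
            if sep and key in WANTED and key not in current:
                current[key] = value.strip()
        if line:
            prev = line
    # The daemon summary has no "Public key" line, so it is dropped here.
    return [p for p in peers if "Public key" in p]

def connected_without_handshake(peers):
    """Names of peers that are Connected but show "-" as last handshake."""
    return [
        p["name"]
        for p in peers
        if p.get("Status") == "Connected"
        and p.get("Last WireGuard handshake", "-") == "-"
    ]

# Trimmed excerpt of the output shared in this issue.
SAMPLE = """\
Peers detail:
 node-3.netbird.selfhosted:
  NetBird IP: 100.81.65.41
  Public key: 9i5W38NvBSXk7oK+v0KeaXn0csEn5AOXIG2mg3TwOAo=
  Status: Connected
  -- detail --
  Connection type: Relayed
  Last WireGuard handshake: 23 seconds ago
 node-2.netbird.selfhosted:
  NetBird IP: 100.81.94.114
  Public key: A0/k9FWRkF+JspDjOVIhk0YaaRDvvTZo3C+kEL0feR0=
  Status: Connected
  -- detail --
  Connection type: Relayed
  Last WireGuard handshake: -
FQDN: elk.netbird.selfhosted
NetBird IP: 100.81.167.156/16
"""

print(connected_without_handshake(parse_peers(SAMPLE)))
# prints ['node-2.netbird.selfhosted']
```

To use it on a real node, save the output of netbird status -d to a file and pass the file contents to parse_peers instead of SAMPLE.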
I am currently experiencing this issue as well
@mlsmaycon It's weird: sudo wg show on node-2 shows nothing, and netbird status -d shows that only one peer is connected (mailer).
Just for your information: node-3 (main) and node-2 (secondary) provide HA routes to my VLANs. Maybe this is relevant to the situation.
Here are detailed logs. I’m really counting on your help.
ELK
sudo wg show
admin@elk:~/elk/current$ sudo wg show
interface: wt0
public key: P0Xd+rb5EjqfaFXIL/KuQ0yGKHT4qTa99Mz4ABrshRA=
private key: (hidden)
listening port: 51820
fwmark: 0x1bd00
peer: WwozlULtO2O17Xm+pkZP/Q3V4IsS+O6f/iA269tLRCs=
endpoint: 127.0.0.1:38406
allowed ips: 100.81.167.93/32
latest handshake: Now
transfer: 1.78 MiB received, 2.07 MiB sent
persistent keepalive: every 25 seconds
peer: Sf0KDTD0Es7T1396ShukzI7Ca4uTALlyPaX3flUjgBs=
endpoint: 5.166.71.0:51820
allowed ips: 100.81.167.67/32
latest handshake: 15 seconds ago
transfer: 250.76 KiB received, 69.13 KiB sent
persistent keepalive: every 25 seconds
peer: I7KwutIWalMYzNhVOpCiOBibnQcwnEmTVMntNtsAT2w=
endpoint: 83.111.22.107:53888
allowed ips: 100.81.158.204/32
latest handshake: 18 seconds ago
transfer: 3.97 KiB received, 1.34 KiB sent
persistent keepalive: every 25 seconds
peer: ybARDYtVgFRDWernX1O10h8X8k9zj2Tzocfz5D6XIUw=
endpoint: 83.111.22.107:64064
allowed ips: 100.81.83.4/32
latest handshake: 23 seconds ago
transfer: 134.44 KiB received, 2.92 MiB sent
persistent keepalive: every 25 seconds
peer: Py38isXqCLhp8htbLCOGKB3V/skGixJg81WTtD69BRw=
endpoint: 127.0.0.1:59332
allowed ips: 100.81.164.248/32
latest handshake: 36 seconds ago
transfer: 1.78 MiB received, 2.08 MiB sent
persistent keepalive: every 25 seconds
peer: O4zyMZPtGANVeeN8mBWH21ag8A6sz5z1Nc0xIc70dlY=
endpoint: 127.0.0.1:48462
allowed ips: 100.81.60.214/32
latest handshake: 52 seconds ago
transfer: 166.51 KiB received, 98.41 KiB sent
persistent keepalive: every 25 seconds
peer: 9i5W38NvBSXk7oK+v0KeaXn0csEn5AOXIG2mg3TwOAo=
endpoint: 127.0.0.1:33385
allowed ips: 100.81.65.41/32, 172.31.255.254/32, 10.10.100.0/24, 10.10.100.8/32, 10.10.99.0/24, 10.10.109.0/24
latest handshake: 1 minute, 9 seconds ago
transfer: 1.74 GiB received, 249.93 MiB sent
persistent keepalive: every 25 seconds
peer: Il+HTYwpJkAZ+bc+bb9hyhJElRdTfalYVJTzhv1Z1iY=
endpoint: 127.0.0.1:42704
allowed ips: 100.81.209.144/32
latest handshake: 1 minute, 21 seconds ago
transfer: 1.56 MiB received, 1.82 MiB sent
persistent keepalive: every 25 seconds
peer: n2NFRfS1DwfJYT8hgqf0zuaBpwxPZryTqdu9eltHYis=
endpoint: 83.111.22.107:45792
allowed ips: 100.81.70.185/32
latest handshake: 1 minute, 52 seconds ago
transfer: 9.47 MiB received, 11.11 MiB sent
persistent keepalive: every 25 seconds
peer: kybNlgmNYCYG7VFvUzfO/iNCsIx9omEsnRhcIWz25l8=
endpoint: 94.59.172.212:51820
allowed ips: 100.81.30.237/32
latest handshake: 1 minute, 58 seconds ago
transfer: 352.38 KiB received, 317.17 KiB sent
persistent keepalive: every 25 seconds
peer: WDAKtiWfo8PcYAUdGGLYmlKY2d2LXKJL4MT5t+MwOW4=
endpoint: 20.174.16.61:51820
allowed ips: 10.0.34.0/24, 10.0.0.0/16, 100.81.194.237/32
latest handshake: 1 minute, 58 seconds ago
transfer: 31.94 MiB received, 12.75 MiB sent
persistent keepalive: every 25 seconds
peer: jJBPh5bJwj21wI/vtfp00U9m8OkCvGL3WjE4lWoDT0Y=
endpoint: 20.174.16.192:51820
allowed ips: 10.0.35.0/24, 100.81.54.220/32
latest handshake: 1 minute, 58 seconds ago
transfer: 30.33 MiB received, 11.30 MiB sent
persistent keepalive: every 25 seconds
peer: l/qo42s5SlYe9xAZl5RGOEBcQG2Bh9cF3xGSBAgPvUc=
endpoint: 192.168.11.74:51820
allowed ips: 100.81.73.30/32
latest handshake: 1 minute, 58 seconds ago
transfer: 311.00 KiB received, 375.11 KiB sent
persistent keepalive: every 25 seconds
peer: 5880lthanHV8ZxE7m0hqRTDZrEbs+lXUGHHxqQ6i0A0=
endpoint: 85.235.166.234:7605
allowed ips: 100.81.71.220/32
latest handshake: 2 minutes, 22 seconds ago
transfer: 354.88 KiB received, 312.35 KiB sent
persistent keepalive: every 25 seconds
peer: S7lfe891aqfCshgL467jEBPluQY4hjVd/E1fMCL0jWI=
endpoint: 85.235.166.234:51820
allowed ips: 100.81.65.209/32
latest handshake: 2 minutes, 22 seconds ago
transfer: 360.44 KiB received, 306.51 KiB sent
persistent keepalive: every 25 seconds
admin@elk:~/elk/current$
netbird status -d
admin@elk:~/elk/current$ netbird status -d
Peers detail:
...
node-3.netbird.selfhosted:
NetBird IP: 100.81.65.41
Public key: 9i5W38NvBSXk7oK+v0KeaXn0csEn5AOXIG2mg3TwOAo=
Status: Connected
-- detail --
Connection type: Relayed
ICE candidate (Local/Remote): -/-
ICE candidate endpoints (Local/Remote): -/-
Relay server address: rels://xxx.xxx.com:443
Last connection update: 1 day, 2 hours ago
Last WireGuard handshake: 23 seconds ago
Transfer status (received/sent) 1.7 GiB/250.1 MiB
Quantum resistance: false
Networks: 10.10.100.0/24, 10.10.100.8/32, 10.10.109.0/24, 10.10.99.0/24, 172.31.255.254/32
Latency: 0s
...
node-2.netbird.selfhosted:
NetBird IP: 100.81.94.114
Public key: A0/k9FWRkF+JspDjOVIhk0YaaRDvvTZo3C+kEL0feR0=
Status: Connected
-- detail --
Connection type: Relayed
ICE candidate (Local/Remote): -/-
ICE candidate endpoints (Local/Remote): -/-
Relay server address: rels://xxx.xxx.com:443
Last connection update: 10 seconds ago
Last WireGuard handshake: -
Transfer status (received/sent) 0 B/444 B
Quantum resistance: false
Networks: -
Latency: 0s
Events:
[INFO] SYSTEM (421617f3-f964-4077-855e-9b451cd285d7)
Message: Network map updated
Time: 1 day, 1 hour ago
[INFO] SYSTEM (74eed161-4d9e-4d3b-b37c-1059aca6aef7)
Message: Network map updated
Time: 1 day, 1 hour ago
[INFO] SYSTEM (1c3d52a9-af3b-4f7b-bb7c-476f6028f67b)
Message: Network map updated
Time: 1 day, 1 hour ago
[INFO] SYSTEM (8323274d-b530-4d2e-9851-92790cb88061)
Message: Network map updated
Time: 23 hours, 4 minutes ago
[INFO] SYSTEM (a36e1bb3-80af-4b0b-80e6-00f6d00bf60d)
Message: Network map updated
Time: 23 hours, 3 minutes ago
[INFO] SYSTEM (1d87b354-f438-46f6-a046-4763e9f4a8e8)
Message: Network map updated
Time: 22 hours, 57 minutes ago
[INFO] SYSTEM (1433b335-4c98-4d48-b8c8-78d796c21df4)
Message: Network map updated
Time: 22 hours, 57 minutes ago
[WARNING] DNS (1b9a75c1-b292-45e3-9577-9bf9bd87973d)
Message: All upstream servers failed (fail count exceeded)
Time: 14 hours, 20 minutes ago
Metadata: upstreams: 172.31.255.254:53
[WARNING] DNS (1b504ac4-28a5-49ba-9f58-bf0161212aca)
Message: All upstream servers failed (fail count exceeded)
Time: 14 hours, 8 minutes ago
Metadata: upstreams: 172.31.255.254:53
[INFO] SYSTEM (7123289f-2b83-4168-87bb-4fe753c5e53f)
Message: Network map updated
Time: 10 hours, 57 minutes ago
OS: linux/amd64
Daemon version: 0.39.2
CLI version: 0.39.2
Management: Connected to https://xxx.xxx.com:443
Signal: Connected to https://xxx.xxx.com:443
Relays:
[stun:xxx.xxx.com:3478] is Available
[turn:xxx.xxx.com:3478?transport=udp] is Available
[rels://xxx.xxx.com:443] is Available
Nameservers:
[172.31.255.254:53] for [.] is Available
FQDN: elk.netbird.selfhosted
NetBird IP: 100.81.167.156/16
Interface type: Kernel
Quantum resistance: false
Networks: -
Forwarding rules: 0
Peers count: 16/22 Connected
node-2
sudo wg show
admin@vpn-node-2:~$ sudo wg show
admin@vpn-node-2:~$ netbird status
OS: linux/amd64
Daemon version: 0.39.1
CLI version: 0.39.1
Management: Connected
Signal: Connected
Relays: 3/3 Available
Nameservers: 0/0 Available
FQDN:
NetBird IP: N/A
Interface type: N/A
Quantum resistance: false
Networks: -
Forwarding rules: 0
Peers count: 1/65 Connected
admin@vpn-node-2:~$
netbird status -d
Peers detail:
peer-x.netbird.selfhosted:
NetBird IP: 100.81.0.152
Public key: vNdIX0gBFbKVq7BDNDO2CQ82YEo7CTK2XZzx1/W7dh8=
Status: Disconnected
-- detail --
Connection type:
ICE candidate (Local/Remote): -/-
ICE candidate endpoints (Local/Remote): -/-
Relay server address:
Last connection update: 9 hours, 7 minutes ago
Last WireGuard handshake: -
Transfer status (received/sent) 0 B/0 B
Quantum resistance: false
Networks: -
Latency: 0s
peer-x.netbird.selfhosted:
NetBird IP: 100.81.3.116
Public key: WESL0LDOEaxCRU4NPDE3gG7RjFcSSX6c0fHZCPw1YnQ=
Status: Disconnected
-- detail --
Connection type:
ICE candidate (Local/Remote): -/-
ICE candidate endpoints (Local/Remote): -/-
Relay server address:
Last connection update: -
Last WireGuard handshake: -
Transfer status (received/sent) 0 B/0 B
Quantum resistance: false
Networks: -
Latency: 0s
peer-x.netbird.selfhosted:
NetBird IP: 100.81.9.229
Public key: 8XnSq88Rl8iUttZn7PEgcVSLcCCglbNth8xyhylRcXI=
Status: Disconnected
-- detail --
Connection type:
ICE candidate (Local/Remote): -/-
ICE candidate endpoints (Local/Remote): -/-
Relay server address:
Last connection update: -
Last WireGuard handshake: -
Transfer status (received/sent) 0 B/0 B
Quantum resistance: false
Networks: -
Latency: 172.832051ms
...
mailer-h1o.netbird.selfhosted:
NetBird IP: 100.81.42.180
Public key: Qe/lDsFFMZEQN7gawfQak+MaRLQSgIbWfmZMu2I6M1s=
Status: Connected
-- detail --
Connection type: Relayed
ICE candidate (Local/Remote): -/-
ICE candidate endpoints (Local/Remote): -/-
Relay server address: rels://xxx.xxx.com:443
Last connection update: 9 hours, 8 minutes ago
Last WireGuard handshake: -
Transfer status (received/sent) 0 B/0 B
Quantum resistance: false
Networks: -
Latency: 426.631µs
peer-x.netbird.selfhosted:
NetBird IP: 100.81.44.192
Public key: RC6U/3lSVZ75VrrxznoDOcVij4BrUPY/JxHoGN1+vmg=
Status: Disconnected
-- detail --
Connection type:
ICE candidate (Local/Remote): -/-
ICE candidate endpoints (Local/Remote): -/-
Relay server address:
Last connection update: 21 minutes, 10 seconds ago
Last WireGuard handshake: -
Transfer status (received/sent) 0 B/0 B
Quantum resistance: false
Networks: -
Latency: 0s
peer-x.netbird.selfhosted:
NetBird IP: 100.81.46.91
Public key: joOx0EAYlCutKpYSRrAYZTVHIvHmO4+SGT5lWOlikQo=
Status: Disconnected
-- detail --
Connection type:
ICE candidate (Local/Remote): -/-
ICE candidate endpoints (Local/Remote): -/-
Relay server address:
Last connection update: -
Last WireGuard handshake: -
Transfer status (received/sent) 0 B/0 B
Quantum resistance: false
Networks: -
Latency: 0s
peer-x.netbird.selfhosted:
NetBird IP: 100.81.47.99
Public key: 6IkrTmW5iCDguN+pF9gnCILYmSus8MFT57pc124sJy8=
Status: Disconnected
-- detail --
Connection type:
ICE candidate (Local/Remote): -/-
ICE candidate endpoints (Local/Remote): -/-
Relay server address:
Last connection update: 9 hours, 1 minutes ago
Last WireGuard handshake: -
Transfer status (received/sent) 0 B/0 B
Quantum resistance: false
Networks: -
Latency: 0s
...
elk.netbird.selfhosted:
NetBird IP: 100.81.167.156
Public key: P0Xd+rb5EjqfaFXIL/KuQ0yGKHT4qTa99Mz4ABrshRA=
Status: Disconnected
-- detail --
Connection type:
ICE candidate (Local/Remote): -/-
ICE candidate endpoints (Local/Remote): -/-
Relay server address:
Last connection update: 9 hours, 7 minutes ago
Last WireGuard handshake: -
Transfer status (received/sent) 0 B/0 B
Quantum resistance: false
Networks: -
Latency: 0s
...
peer-x.netbird.selfhosted:
NetBird IP: 100.81.255.169
Public key: +j04oxpm3AEcNeC56aqK/Jn+4GcrcEIhEG2vj1D/Xkc=
Status: Disconnected
-- detail --
Connection type:
ICE candidate (Local/Remote): -/-
ICE candidate endpoints (Local/Remote): -/-
Relay server address:
Last connection update: 9 hours, 4 minutes ago
Last WireGuard handshake: -
Transfer status (received/sent) 0 B/0 B
Quantum resistance: false
Networks: -
Latency: 0s
Events:
[INFO] SYSTEM (4dcf13f8-58fa-4cd7-bdd5-724e17858a61)
Message: Network map updated
Time: 1 day, 1 hour ago
[INFO] SYSTEM (aaf70a6b-2c20-47c6-9920-462bbdc5a6fc)
Message: Network map updated
Time: 1 day, 1 hour ago
[INFO] SYSTEM (825e57c0-de80-4c84-87b5-a9bd17689670)
Message: Network map updated
Time: 1 day, 1 hour ago
[INFO] SYSTEM (5d1dc5d9-c453-4d7a-a4c3-f676c0135508)
Message: Network map updated
Time: 23 hours, 14 minutes ago
[INFO] SYSTEM (171ac399-a8e3-47d6-9e44-3cbdb9060bc3)
Message: Network map updated
Time: 23 hours, 14 minutes ago
[INFO] SYSTEM (ad2afed7-3cf8-4c01-b54b-f14188438aff)
Message: Network map updated
Time: 23 hours, 8 minutes ago
[INFO] SYSTEM (570f020e-ef88-44e8-91d1-b1805a8edaf8)
Message: Network map updated
Time: 23 hours, 8 minutes ago
[INFO] SYSTEM (04eb9d77-b78e-4670-8a2a-926b758a9dd9)
Message: Network map updated
Time: 11 hours, 8 minutes ago
[INFO] SYSTEM (a3587e1d-c7d4-4249-bdf6-bbc62040cc3a)
Message: Network map updated
Time: 9 hours, 8 minutes ago
[INFO] SYSTEM (067e59a4-6455-4a19-b2d3-640e12278633)
Message: Network map updated
Time: 9 hours, 8 minutes ago
OS: linux/amd64
Daemon version: 0.39.1
CLI version: 0.39.1
Management: Connected to https://xxx.xxx.com:443
Signal: Connected to https://xxx.xxx.com:443
Relays:
[stun:xxx.xxx.com:3478] is Available
[turn:xxx.xxx.com:3478?transport=udp] is Available
[rels://xxx.xxx.com:443] is Available
Nameservers:
FQDN:
NetBird IP: N/A
Interface type: N/A
Quantum resistance: false
Networks: -
Forwarding rules: 0
Peers count: 1/65 Connected
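In the ELK output above, node-2's public key does not appear among the wg show peers, even though netbird status -d lists node-2 as Connected. The Python sketch below cross-checks the two outputs in that way. It assumes the wg show layout pasted above, and it assumes that a 127.0.0.1 endpoint indicates the local relay proxy, which matches node-3 (reported as Relayed) in this issue but is not confirmed in the thread. The function names and labels are hypothetical.

```python
# Sketch only: parse_wg_show and classify are illustrative helpers.
def parse_wg_show(text):
    """Map each peer public key in `wg show` text to its fields."""
    peers, current = {}, None
    for raw in text.splitlines():
        key, sep, value = raw.strip().partition(": ")
        if not sep:
            continue
        if key == "interface":
            current = None          # interface fields are not peer fields
        elif key == "peer":
            current = peers.setdefault(value, {})
        elif current is not None:
            current[key] = value
    return peers

def classify(connected, wg_peers):
    """connected: {peer name: public key} for peers netbird shows as Connected."""
    report = {}
    for name, pubkey in connected.items():
        peer = wg_peers.get(pubkey)
        if peer is None:
            report[name] = "missing from wg show"
        elif "latest handshake" not in peer:
            # wg omits this line when no handshake has completed
            report[name] = "no handshake yet"
        elif peer.get("endpoint", "").startswith("127.0.0.1:"):
            # Assumption: a loopback endpoint means the local relay proxy
            report[name] = "handshake via local relay proxy"
        else:
            report[name] = "handshake, direct endpoint"
    return report

# Trimmed excerpt of the ELK `sudo wg show` output from this issue.
WG_SAMPLE = """\
interface: wt0
  public key: P0Xd+rb5EjqfaFXIL/KuQ0yGKHT4qTa99Mz4ABrshRA=
  listening port: 51820
peer: Sf0KDTD0Es7T1396ShukzI7Ca4uTALlyPaX3flUjgBs=
  endpoint: 5.166.71.0:51820
  latest handshake: 15 seconds ago
peer: 9i5W38NvBSXk7oK+v0KeaXn0csEn5AOXIG2mg3TwOAo=
  endpoint: 127.0.0.1:33385
  latest handshake: 1 minute, 9 seconds ago
"""

CONNECTED = {
    "node-3.netbird.selfhosted": "9i5W38NvBSXk7oK+v0KeaXn0csEn5AOXIG2mg3TwOAo=",
    "node-2.netbird.selfhosted": "A0/k9FWRkF+JspDjOVIhk0YaaRDvvTZo3C+kEL0feR0=",
}

for name, state in classify(CONNECTED, parse_wg_show(WG_SAMPLE)).items():
    print(name, "->", state)
```

A peer that lands in the "missing from wg show" bucket matches the symptom described in this issue: the agent reports the connection, but the kernel WireGuard interface has no entry for it.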
Have you checked the firewalls?
@bareksml Yes, there are no firewall restrictions. If I just restart netbird on both nodes, everything works again. But this issue occurs very often.
Hello @mlsmaycon The issue still exists. I sent the debug logs (netbird debug for 1m -AS) from the elk node to support at netbird.io. What else can I do to solve the problem?
@pappz It's so unstable. We can't use netbird when peers periodically become unavailable. Maybe you can assist with this?
Sure! Could you send me the link to the debug bundle pkg via Slack?
@pappz Yes, I sent it to you on Slack.
@netandreus Could you run a test with the latest version of NetBird? I don’t see any evidence in the logs, but we made some changes that might affect your case.
Thank you! I updated it.
admin@vpn-node-2:~$ netbird version
0.39.1
admin@vpn-node-2:~$ netbird version
0.41.3
Will monitor this week.
Day two — so far, so good.
@netandreus Did the upgrade solve your issue?
Everything was fine for three weeks, but today the error came back. I will send you a detailed report tomorrow.
Hello, @nazarewk !
Current status
Now the status is like this:
I have a remote peer (ELK) and an NS server located behind a Mikrotik. The route to the Mikrotik's DNS (172.31.255.254/32) is provided by 2 NetBird peers in HA mode: main (node-3) and backup (node-2).
node-3
vpn-uae-node-3.netbird.selfhosted:
NetBird IP: 100.81.65.41
Public key: 9i5W38NvBSXk7oK+v0KeaXn0csEn5AOXIG2mg3TwOAo=
Status: Connected
-- detail --
Connection type: Relayed
ICE candidate (Local/Remote): -/-
ICE candidate endpoints (Local/Remote): -/-
Relay server address: rels://xxx.yyy.com:443
Last connection update: 10 hours, 4 minutes ago
Last WireGuard handshake: 1 minute, 54 seconds ago
Transfer status (received/sent) 698.9 MiB/103.4 MiB
Quantum resistance: false
Networks: 10.10.100.0/24, 10.10.100.8/32, 10.10.106.0/24, 10.10.109.0/24, 10.10.99.0/24, 172.31.255.254/32
Latency: 0s
node-2
vpn-uae-node-2.netbird.selfhosted:
NetBird IP: 100.81.94.114
Public key: A0/k9FWRkF+JspDjOVIhk0YaaRDvvTZo3C+kEL0feR0=
Status: Connected
-- detail --
Connection type: Relayed
ICE candidate (Local/Remote): -/-
ICE candidate endpoints (Local/Remote): -/-
Relay server address: rels://xxx.yyy.com:443
Last connection update: 10 hours, 4 minutes ago
Last WireGuard handshake: 1 minute, 54 seconds ago
Transfer status (received/sent) 132.2 KiB/123.4 KiB
Quantum resistance: false
Networks: -
Latency: 0s
Key points
- Both peers can be pinged from the ELK peer
- The NS server can be pinged from both peers
- The NS server cannot be pinged from the ELK peer
I sent you the debug bundle on Slack. Maybe you or @pappz have some ideas on how to fix it?
Update
- I ran netbird down && netbird up on the ELK peer, without result.
- After a full reboot of the entire ELK VM, I can ping the NS1 server.
Same issue here
Same here
I have the same issue in a Docker environment. The legacy Linux installation solved the problem.