Netbird running inside unprivileged LXC stops working after Proxmox update - likely proxmox kernel 6.17.4-1 issue

Open LordAnchemis opened this issue 2 months ago • 3 comments

Netbird inside an LXC container disconnected after a Proxmox update. It appears to have no connection to the signal service.

Proxmox version: 9.1.1
LXC template: debian-13.1-2-standard
Netbird version: 0.60.7

Updated proxmox to 9.1.2

  • list of updated packages in proxmox host that may have borked netbird
qemu-server:amd64 9.1.1
qemu-server:amd64 9.1.2

man-db:amd64 2.13.1-1
dbus:amd64 1.16.2-2

libpve-common-perl:all 9.1.0
libpve-common-perl:all 9.1.1

libpve-access-control:all 9.0.4
libpve-access-control:all 9.0.5

libpve-network-perl:all 1.2.3
libpve-network-perl:all 1.2.4
libpve-network-api-perl:all 1.2.3
libpve-network-api-perl:all 1.2.4

libpve-rs-perl:amd64 0.11.3
libpve-rs-perl:amd64 0.11.4

pve-manager:all 9.1.2

pve-i18n:all 3.6.5
pve-i18n:all 3.6.6

pve-ha-manager:amd64 5.0.8

pve-yew-mobile-i18n:all 3.6.5
pve-yew-mobile-i18n:all 3.6.6

proxmox-widget-toolkit:all 5.1.2
proxmox-widget-toolkit:all 5.1.5

proxmox-kernel-6.17:all 6.17.2-2
proxmox-kernel-6.17.2-1-pve-signed:amd64 6.17.2-1

proxmox-kernel-6.17:all 6.17.4-1
proxmox-kernel-6.17.4-1-pve-signed:amd64 <none> 6.17.4-1
proxmox-kernel-6.17.4-1-pve-signed:amd64 6.17.4-1

Netbird disconnected. Unable to reconnect.

To Reproduce

Steps to reproduce the behavior:

  1. Install Proxmox
  2. Update to latest version 9.1.2
  3. Create unprivileged LXC using debian-13.1 trixie template
  4. Install netbird via script - fails as unable to connect to github.com:443. Netbird can be installed manually by adding the repo instead
  5. Netbird up fails - see error message below

Expected behavior

Netbird works

Are you using NetBird Cloud?

No

Please specify whether you use NetBird Cloud or self-host NetBird's control plane.

NetBird version

0.60.7 - same problem occurs with 0.60.8

Is any other VPN software installed?

No

Debug output

netbird status -dA

Peers detail:
Events: No events recorded
OS: linux/amd64
Daemon version: 0.60.7
CLI version: 0.60.7
Profile: default
Management: Connected to https://api.netbird.io:443
Signal: Disconnected
Relays: 
Nameservers: 
FQDN: yourdevice.netbird.cloud
NetBird IP: 100.x.x.x/16
Interface type: Kernel
Quantum resistance: false
Lazy connection: false
SSH Server: Disabled
Networks: -
Forwarding rules: 0
Peers count: 0/0 Connected
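
For anyone comparing several containers, the status output above can be checked mechanically. This is only a minimal sketch: it assumes the output keeps the plain `Key: value` layout pasted above, and the helper names `parse_status` and `disconnected_services` are made up for illustration, not part of NetBird.

```python
# Sketch: parse "Key: value" lines as pasted from `netbird status -dA`
# and report which control-plane services are not connected.
# Assumption: the output keeps the layout shown in this issue.

SAMPLE = """\
Management: Connected to https://api.netbird.io:443
Signal: Disconnected
Relays: 
Peers count: 0/0 Connected
"""


def parse_status(text):
    """Return a dict mapping each field name to its (possibly empty) value."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so URLs in the value stay intact.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def disconnected_services(fields, names=("Management", "Signal")):
    """List the named services whose value does not start with 'Connected'."""
    return [n for n in names if not fields.get(n, "").startswith("Connected")]


if __name__ == "__main__":
    print(disconnected_services(parse_status(SAMPLE)))
```

With the sample taken from this report, only `Signal` is flagged, which matches the symptom described: management is reachable, signal is not.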

Logs:

Before failure

INFO [peer: XXX] client/internal/peer/handshaker.go:159: sending offer with serial: XXX

** repeats of these - one for each peer I have on the network (not copying all of these for brevity)

Probably during the Proxmox update, going by the logs

INFO client/cmd/root.go:193: shutdown signal received
INFO [peer: XXX] client/internal/peer/handshaker.go:114: stop listening for remote offers and answers
INFO client/internal/engine.go:292: Network monitor: stopped
INFO [relay: rels://streamline-es-mad1-0.relay.netbird.io:443] shared/relay/client/client.go:597: closing all peer connections
INFO [relay: rels://streamline-es-mad1-0.relay.netbird.io:443] shared/relay/client/client.go:370: start to Relay read loop exit
INFO [relay: rels://streamline-uk-lon1-1.relay.netbird.io:443] shared/relay/client/client.go:597: closing all peer connections
INFO [relay: rels://streamline-uk-lon1-1.relay.netbird.io:443] shared/relay/client/client.go:370: start to Relay read loop exit
INFO [relay: rels://streamline-es-mad1-1.relay.netbird.io:443] shared/relay/client/client.go:597: closing all peer connections
INFO [relay: rels://streamline-es-mad1-1.relay.netbird.io:443] shared/relay/client/client.go:370: start to Relay read loop exit
INFO client/internal/connect.go:303: ensuring wt0 is removed, Netbird engine context cancelled
INFO client/internal/wg_iface_monitor.go:58: Interface monitor: stopped for wt0
WARN client/internal/engine.go:537: WireGuard interface monitor: wg interface monitor stopped: context canceled
INFO [peer: XXX] client/internal/peer/handshaker.go:114: stop listening for remote offers and answers

** repeats of these - one for each peer I have on the network (not copying all of these for brevity)

INFO [relay: rels://streamline-es-mad1-0.relay.netbird.io:443] shared/relay/client/client.go:605: waiting for read loop to close
INFO [relay: rels://streamline-es-mad1-0.relay.netbird.io:443] shared/relay/client/client.go:607: relay connection closed
WARN [relay: rels://streamline-es-mad1-0.relay.netbird.io:443] shared/relay/client/client.go:588: relay connection was already marked as not running
INFO [relay: rels://streamline-uk-lon1-1.relay.netbird.io:443] shared/relay/client/client.go:605: waiting for read loop to close
INFO [relay: rels://streamline-uk-lon1-1.relay.netbird.io:443] shared/relay/client/client.go:607: relay connection closed
WARN [relay: rels://streamline-uk-lon1-1.relay.netbird.io:443] shared/relay/client/client.go:588: relay connection was already marked as not running
INFO [relay: rels://streamline-es-mad1-1.relay.netbird.io:443] shared/relay/client/client.go:605: waiting for read loop to close
INFO [relay: rels://streamline-es-mad1-1.relay.netbird.io:443] shared/relay/client/client.go:607: relay connection closed
WARN [relay: rels://streamline-es-mad1-1.relay.netbird.io:443] shared/relay/client/client.go:588: relay connection was already marked as not running

INFO client/ssh/config/manager.go:249: Removed NetBird SSH config: /etc/ssh/ssh_config.d/99-netbird.conf

INFO client/internal/engine.go:311: cleaning up status recorder states

INFO [peer: XXX] client/internal/peer/conn.go:228: close peer connection
INFO [peer: XXX] client/internal/peer/conn.go:262: peer connection closed
INFO [peer: XXX] client/internal/peer/conn.go:228: close peer connection
WARN [peer: XXX] client/internal/peer/worker_relay.go:124: failed to close relay connection: use of closed network connection

** repeats of these - one for each peer I have on the network (not copying all of these for brevity)

After borking netbird

INFO client/internal/routemanager/manager.go:307: Routing cleanup complete
ERRO client/iface/udpmux/universal.go:98: error while reading packet: shared socked stopped
INFO client/iface/iface.go:309: interface wt0 has been removed
INFO client/internal/engine.go:362: stopped Netbird Engine
INFO client/internal/engine.go:292: Network monitor: stopped
INFO client/internal/engine.go:311: cleaning up status recorder states
INFO client/internal/routemanager/manager.go:307: Routing cleanup complete
INFO client/internal/engine.go:362: stopped Netbird Engine
INFO client/internal/connect.go:313: stopped NetBird client
INFO shared/signal/client/worker.go:51: Message worker stopping due to context cancellation
INFO client/server/server.go:855: service is down
INFO client/cmd/service_controller.go:100: stopped NetBird service
INFO client/cmd/service_controller.go:27: starting NetBird service
INFO client/internal/statemanager/manager.go:412: cleaning up state ssh_config_state
INFO client/cmd/service_controller.go:74: started daemon server: /var/run/netbird.sock

After which, you just end up with this error log on repeat

ERRO shared/signal/client/grpc.go:70: failed to connect to the signalling server: create connection: dial context: context deadline exceeded
ERRO client/internal/connect.go:501: error while connecting to the Signal Exchange Service signal.netbird.io:443: create connection: dial context: context deadline exceeded
ERRO client/internal/connect.go:234: rpc error: code = FailedPrecondition desc = failed connecting to Signal Service : create connection: dial context: context deadline exceeded
INFO ./caller_not_available:0: 2025/12/19 13:21:16 WARNING: [core] [Channel #52 SubChannel #53]grpc: addrConn.createTransport failed to connect to {Addr: "signal.netbird.io:443", ServerName: "signal.netbird.io:443", BalancerAttributes: {"<%!p(pickfirstleaf.managedByPickfirstKeyType={})>": "<%!p(bool=true)>" }}. Err: connection error: desc = "transport: Error while dialing: nbnet.NewDialer().DialContext: dial tcp [2a04:3542:1000:910:2465:1fff:fe8a:2f9a]:443: connect: connection refused"
INFO ./caller_not_available:0: 2025/12/19 13:21:17 WARNING: [core] [Channel #52 SubChannel #53]grpc: addrConn.createTransport failed to connect to {Addr: "signal.netbird.io:443", ServerName: "signal.netbird.io:443", BalancerAttributes: {"<%!p(pickfirstleaf.managedByPickfirstKeyType={})>": "<%!p(bool=true)>" }}. Err: connection error: desc = "transport: Error while dialing: nbnet.NewDialer().DialContext: dial tcp [2a04:3542:1000:910:2465:1fff:fe8a:2f9a]:443: connect: connection refused"
INFO ./caller_not_available:0: 2025/12/19 13:21:18 WARNING: [core] [Channel #52 SubChannel #53]grpc: addrConn.createTransport failed to connect to {Addr: "signal.netbird.io:443", ServerName: "signal.netbird.io:443", BalancerAttributes: {"<%!p(pickfirstleaf.managedByPickfirstKeyType={})>": "<%!p(bool=true)>" }}. Err: connection error: desc = "transport: Error while dialing: nbnet.NewDialer().DialContext: dial tcp [2a04:3542:1000:910:2465:1fff:fe8a:2f9a]:443: connect: connection refused"
INFO ./caller_not_available:0: 2025/12/19 13:21:20 WARNING: [core] [Channel #52 SubChannel #53]grpc: addrConn.createTransport failed to connect to {Addr: "signal.netbird.io:443", ServerName: "signal.netbird.io:443", BalancerAttributes: {"<%!p(pickfirstleaf.managedByPickfirstKeyType={})>": "<%!p(bool=true)>" }}. Err: connection error: desc = "transport: Error while dialing: nbnet.NewDialer().DialContext: dial tcp [2a04:3542:1000:910:2465:1fff:fe8a:2f9a]:443: connect: connection refused"
INFO ./caller_not_available:0: 2025/12/19 13:21:24 WARNING: [core] [Channel #52 SubChannel #53]grpc: addrConn.createTransport failed to connect to {Addr: "signal.netbird.io:443", ServerName: "signal.netbird.io:443", BalancerAttributes: {"<%!p(pickfirstleaf.managedByPickfirstKeyType={})>": "<%!p(bool=true)>" }}. Err: connection error: desc = "transport: Error while dialing: nbnet.NewDialer().DialContext: dial tcp [2a04:3542:1000:910:2465:1fff:fe8a:2f9a]:443: connect: connection refused"
INFO ./caller_not_available:0: 2025/12/19 13:21:31 WARNING: [core] [Channel #52 SubChannel #53]grpc: addrConn.createTransport failed to connect to {Addr: "signal.netbird.io:443", ServerName: "signal.netbird.io:443", BalancerAttributes: {"<%!p(pickfirstleaf.managedByPickfirstKeyType={})>": "<%!p(bool=true)>" }}. Err: connection error: desc = "transport: Error while dialing: nbnet.NewDialer().DialContext: dial tcp [2a04:3542:1000:910:2465:1fff:fe8a:2f9a]:443: connect: connection refused"

Additional context

Asked on Reddit - other users suggested it is a possible AppArmor issue. But following the guide (using lxc.apparmor.profile = unconfined, allowing cgroup2 devices, etc.) does NOT solve the problem

Have you tried these troubleshooting steps?

  • [Y] Reviewed client troubleshooting (if applicable)
  • [Y] Checked for newer NetBird versions
  • [Y] Searched for similar issues on GitHub (including closed ones)
  • [Y] Restarted the NetBird client
  • [Y] Disabled other VPN software
  • [Y] Checked firewall settings

LordAnchemis avatar Dec 18 '25 15:12 LordAnchemis

Guessing you're running the client in an unprivileged LXC? This is likely AppArmor + missing kernel capabilities after a recent Proxmox update. Recent Proxmox updates tightened LXC defaults. NetBird requires keyctl, netlink access, and tun device creation - which unprivileged LXCs now block by default.

Fixes:

Option 1: Relax LXC config (most common)

Edit /etc/pve/lxc/.conf on the Proxmox host:

lxc.apparmor.profile: unconfined
lxc.cap.drop:
lxc.mount.auto: proc:rw sys:rw
lxc.cgroup2.devices.allow: c 10:200 rwm
lxc.cgroup2.devices.allow: c 10:229 rwm

Restart the container.

Option 2: Enable nesting + keyctl

features: keyctl=1,nesting=1

Option 3: Run NetBird in a VM instead of an LXC
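
To see whether the device rule from Option 1 actually took effect, one rough check from inside the container is to look for the TUN device node. On Linux, character device 10:200 is /dev/net/tun (10:229 is /dev/fuse). The sketch below only inspects the device node; it says nothing about keyctl, netlink or AppArmor, and the function names are hypothetical.

```python
# Sketch: check from inside the container whether /dev/net/tun exists
# as character device 10:200, the device allowed by the cgroup2 rule.
import os
import stat

TUN_PATH = "/dev/net/tun"
TUN_DEVICE = (10, 200)  # major, minor of the Linux TUN/TAP device


def describe_char_device(path):
    """Return (major, minor) if path is a character device, else None."""
    try:
        st = os.stat(path)
    except OSError:
        return None  # missing node or no permission to stat it
    if not stat.S_ISCHR(st.st_mode):
        return None
    return os.major(st.st_rdev), os.minor(st.st_rdev)


def tun_available(path=TUN_PATH):
    """True if the path is a character device with the TUN numbers."""
    return describe_char_device(path) == TUN_DEVICE


if __name__ == "__main__":
    print(f"{TUN_PATH} present as c 10:200: {tun_available()}")
```

If this prints `False` after restarting the container, the config change probably was not applied to the container that NetBird runs in.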

shuuri-labs avatar Dec 19 '25 13:12 shuuri-labs

The problem is I've tried the first 2 fixes - and it still doesn't work

The Netbird service still has difficulty connecting to signal.netbird.io:443. From the error logs, it seems that the connection to the Netbird IPv6 address [2a04:3542:1000:910:2465:1fff:fe8a:2f9a]:443 is being refused
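
Since the log only shows the IPv6 address being refused, it may help to test IPv4 and IPv6 separately from inside the container. The following is a rough sketch using only the Python standard library. The `probe` helper is hypothetical, and a plain TCP connect is a much weaker test than what the NetBird client does (no TLS, no gRPC), so treat the result as a hint only.

```python
# Sketch: resolve a host and attempt a plain TCP connect to every
# address returned, to see whether IPv4 and IPv6 behave differently.
import socket


def probe(host, port, timeout=5.0):
    """Return a list of (family_label, address, error_or_None) tuples."""
    results = []
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    for family, socktype, proto, _canon, sockaddr in infos:
        if family == socket.AF_INET6:
            label = "IPv6"
        elif family == socket.AF_INET:
            label = "IPv4"
        else:
            label = str(family)
        try:
            with socket.socket(family, socktype, proto) as s:
                s.settimeout(timeout)
                s.connect(sockaddr)
            results.append((label, sockaddr[0], None))
        except OSError as exc:
            # Covers refused connections, timeouts and unreachable networks.
            results.append((label, sockaddr[0], str(exc)))
    return results


if __name__ == "__main__":
    try:
        for label, addr, err in probe("signal.netbird.io", 443):
            print(f"{label} {addr}: {'ok' if err is None else err}")
    except OSError as exc:  # e.g. DNS resolution failed
        print(f"lookup failed: {exc}")
```

If IPv4 connects while IPv6 is refused, the problem is more likely in the container's IPv6 path than in the signal service itself.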

Option 1: Relax LXC config (most common) Edit /etc/pve/lxc/.conf on the Proxmox host:

-> tried this, doesn't work, same error

Option 2: Enable nesting + keyctl

-> tried this, doesn't work, same error

Option 3: Run NetBird in a VM instead of an LXC

-> this works (my VM netbird clients are unaffected), but it defeats the point of an LXC

LordAnchemis avatar Dec 19 '25 13:12 LordAnchemis

UPDATE: appears to be fixed after Proxmox kernel update

  • proxmox-kernel-6.17.4-2
  • proxmox-kernel-6.14.11-5
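
For anyone checking whether a host is still on the kernel this thread suspects, here is a small sketch. It assumes that `uname -r` on a Proxmox kernel reports a string of the form `6.17.4-1-pve`. That format, and the idea that 6.17.4-1 alone is at fault, are assumptions drawn from this thread, not confirmed facts. An LXC container shares the host kernel, so this can be run inside the container as well.

```python
# Sketch: compare the running kernel release against the build that
# this thread suspects (6.17.4-1). The "-pve" release format is assumed.
import platform
import re

SUSPECTED = (6, 17, 4, 1)  # per this issue; 6.17.4-2 was reported as working


def parse_pve_release(release):
    """Turn '6.17.4-1-pve' into (6, 17, 4, 1); return None if it does not match."""
    m = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)-(\d+)-pve", release)
    return tuple(int(g) for g in m.groups()) if m else None


def is_suspected_kernel(release):
    """True only for the exact kernel build suspected in this thread."""
    return parse_pve_release(release) == SUSPECTED


if __name__ == "__main__":
    running = platform.release()
    print(f"{running}: suspected kernel = {is_suspected_kernel(running)}")
```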

LordAnchemis avatar Dec 31 '25 23:12 LordAnchemis