Android app crashing when connecting - only if Rosenpass active

Open jhf2442 opened this issue 3 months ago • 4 comments

The title says it all: the app starts, and at the bottom I see 0/5 peers connected, then 1/5, sometimes even 2/5, and then the app crashes. If I deactivate Rosenpass (all other hosts are Linux computers with permissive Rosenpass activated), the app connects and keeps running. Whether permissive Rosenpass is enabled on the Android client makes no difference to the crash.

Client is v0.1.9

Logfile:

09-22 01:40:57.318 26441 26823 E GoLog   : [DEBG] [peer: acZUdDy0idX1Gmi5VXeKwCpCH+25d3lzGPepKq9PbjY=] client/internal/peer/wg_watcher.go:50 enable WireGuard watcher
09-22 01:40:57.318 26441 26823 E GoLog   : [INFO] [peer: acZUdDy0idX1Gmi5VXeKwCpCH+25d3lzGPepKq9PbjY=] client/internal/peer/wg_watcher.go:90 WireGuard watcher started
09-22 01:40:57.355 26441 26503 E GoLog   : [INFO] [peer: 6jSs6ZRt4D9P9VibhIGhufh+HvvuBGW1f/SM4Hb5FzQ=] client/internal/peer/conn.go:500 start to communicate with peer via relay
09-22 01:40:57.368 26441 26747 E GoLog   : [DEBG] client/internal/rosenpass/manager.go:186 received remote rosenpass key 1cae0cec0679fa92003328f4225c3ea525bcbe26cd38b2ec33430c08c53f2faa, my key 036ca70cf7627eaa08a0c6b740e3778fc9dccfd3853166789e0579a382b18189
09-22 01:40:57.372 26441 26747 I GoLog   : time=2025-09-21T23:40:57.372Z level=DEBUG msg="Skipping handshake due to missing endpoint" pid="s//d4x53F7ixH7oKHCOGLnbKOz01nXyrihG9rwwtvSk="
09-22 01:40:57.374 26441 26747 I GoLog   : time=2025-09-21T23:40:57.374Z level=DEBUG msg="Added peer" pid="s//d4x53F7ixH7oKHCOGLnbKOz01nXyrihG9rwwtvSk="
09-22 01:40:57.376 26441     0 E Go      : panic: runtime error: invalid memory address or nil pointer dereference
09-22 01:40:57.376 26441     0 E Go      : [signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x77fc64cfa0]
09-22 01:40:57.376 26441     0 E Go      : 
09-22 01:40:57.376 26441     0 E Go      : goroutine 402 [running]:
09-22 01:40:57.377 26441     0 E Go      : github.com/netbirdio/netbird/client/internal/rosenpass.(*NetbirdHandler).AddPeer(...)
09-22 01:40:57.377 26441     0 E Go      : 	/home/runner/work/android-client/android-client/netbird/client/internal/rosenpass/netbird_handler.go:44
09-22 01:40:57.377 26441     0 E Go      : github.com/netbirdio/netbird/client/internal/rosenpass.(*Manager).addPeer(0x40000fbd80, {0x4003512000?, 0x77fb8e8457?, 0x2b?}, {0x40007e4d26, 0x6}, {0x4000426560, 0xc}, {0x40007ae540, 0x2c})
09-22 01:40:57.377 26441     0 E Go      : 	/home/runner/work/android-client/android-client/netbird/client/internal/rosenpass/manager.go:86 +0x2b0
09-22 01:40:57.377 26441     0 E Go      : github.com/netbirdio/netbird/client/internal/rosenpass.(*Manager).OnConnected(0x40000fbd80, {0x40007ae540, 0x2c}, {0x4003512000, 0x7ff80, 0x80000}, {0x4000426560, 0xc}, {0x40007e4d26, 0x6})
09-22 01:40:57.377 26441     0 E Go      : 	/home/runner/work/android-client/android-client/netbird/client/internal/rosenpass/manager.go:188 +0x1b0
09-22 01:40:57.377 26441     0 E Go      : github.com/netbirdio/netbird/client/internal/peer.(*Conn).doOnConnected(0x4000144b48, {0x4003512000, 0x7ff80, 0x80000}, {0x40007e4d26, 0x6})
09-22 01:40:57.377 26441     0 E Go      : 	/home/runner/work/android-client/android-client/netbird/client/internal/peer/conn.go:622 +0xa8
09-22 01:40:57.377 26441     0 E Go      : github.com/netbirdio/netbird/client/internal/peer.(*Conn).onRelayConnectionIsReady(0x4000144b48, {{0x77fcb379f0, 0x400092c740}, {0x4003512000, 0x7ff80, 0x80000}, {0x40007e4d26, 0x6}})
09-22 01:40:57.377 26441     0 E Go      : 	/home/runner/work/android-client/android-client/netbird/client/internal/peer/conn.go:501 +0x524
09-22 01:40:57.377 26441     0 E Go      : created by github.com/netbirdio/netbird/client/internal/peer.(*WorkerRelay).OnNewOffer in goroutine 431
09-22 01:40:57.377 26441     0 E Go      : 	/home/runner/work/android-client/android-client/netbird/client/internal/peer/worker_relay.go:89 +0x3fc
09-22 01:40:57.380 26441 26823 F libc    : Fatal signal 6 (SIGABRT), code -6 (SI_TKILL) in tid 26823 (Thread-9), pid 26441 (.netbird.client)
09-22 01:40:57.552 26828 26828 I crash_dump64: obtaining output fd from tombstoned, type: kDebuggerdTombstoneProto
09-22 01:40:57.556 26828 26828 I crash_dump64: performing dump of process 26441 (target tid = 26823)
09-22 01:40:58.133 26828 26828 F DEBUG   : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
09-22 01:40:58.133 26828 26828 F DEBUG   : Build fingerprint: 'google/panther/panther:16/BP3A.250905.014/13873947:user/release-keys'
09-22 01:40:58.133 26828 26828 F DEBUG   : Kernel Release: '6.1.134-android14-11-g66e758f7d0c0-ab13748739'
09-22 01:40:58.133 26828 26828 F DEBUG   : Revision: 'MP1.0'
09-22 01:40:58.133 26828 26828 F DEBUG   : ABI: 'arm64'
09-22 01:40:58.133 26828 26828 F DEBUG   : Timestamp: 2025-09-22 01:40:57.568077611+0200
09-22 01:40:58.133 26828 26828 F DEBUG   : Process uptime: 429s
09-22 01:40:58.133 26828 26828 F DEBUG   : Executable: /system/bin/app_process64
09-22 01:40:58.133 26828 26828 F DEBUG   : Cmdline: io.netbird.client
09-22 01:40:58.133 26828 26828 F DEBUG   : pid: 26441, tid: 26823, name: Thread-9  >>> io.netbird.client <<<
09-22 01:40:58.133 26828 26828 F DEBUG   : uid: 10586
09-22 01:40:58.133 26828 26828 F DEBUG   : tagged_addr_ctrl: 0000000000000001 (PR_TAGGED_ADDR_ENABLE)
09-22 01:40:58.133 26828 26828 F DEBUG   : esr: 0000000092000006 (Data Abort Exception 0x24)
09-22 01:40:58.133 26828 26828 F DEBUG   : signal 6 (SIGABRT), code -6 (SI_TKILL), fault addr --------
09-22 01:40:58.133 26828 26828 F DEBUG   :     x0  0000000000000000  x1  00000000000068c7  x2  0000000000000006  x3  0000000000000008
09-22 01:40:58.133 26828 26828 F DEBUG   :     x4  0000000000000001  x5  0000000000000001  x6  0000000000000000  x7  000000000000000a
09-22 01:40:58.133 26828 26828 F DEBUG   :     x8  0000000000000083  x9  0000000000000068  x10 000000000000000a  x11 00000077fce46696
09-22 01:40:58.133 26828 26828 F DEBUG   :     x12 0000000000000005  x13 00000077fb9a6e28  x14 000000000000015f  x15 0000000000000160
09-22 01:40:58.133 26828 26828 F DEBUG   :     x16 000000792b660b90  x17 000000792b758040  x18 000000792a914000  x19 0000000000006749
09-22 01:40:58.133 26828 26828 F DEBUG   :     x20 000000792b757ee0  x21 0000004000500e08  x22 000000000000001b  x23 a52e58d4052c4f3b
09-22 01:40:58.133 26828 26828 F DEBUG   :     x24 9ba5ea8ef721fc48  x25 9404eb699e97b0b3  x26 0000000000000000  x27 0000000000000010
09-22 01:40:58.133 26828 26828 F DEBUG   :     x28 0000004000a141c0  x29 00000040009459a8
09-22 01:40:58.133 26828 26828 F DEBUG   :     lr  00000077fbb31da4  sp  00000040009459b0  pc  00000077fbb5a838  pst 0000000080001000
09-22 01:40:58.133 26828 26828 F DEBUG   :     esr 0000000092000006
09-22 01:40:58.133 26828 26828 F DEBUG   : 1 total frames
09-22 01:40:58.133 26828 26828 F DEBUG   : backtrace:
09-22 01:40:58.133 26828 26828 F DEBUG   :       #00 pc 00000000006cc838  /data/app/~~n13lt5ROCw1v7m7VQsgZ7Q==/io.netbird.client-L30xqqu09SGFpnUjxS2UjQ==/split_config.arm64_v8a.apk (offset 0x8000) (runtime.raise.abi0+40)
09-22 01:40:59.574 26855 26855 I Zygote  : Process 26855 crated for io.netbird.client
09-22 01:40:59.575 26855 26855 I .netbird.client: Using CollectorTypeCMC GC.
09-22 01:40:59.582 26855 26855 E .netbird.client: Not starting debugger since process cannot load the jdwp agent.
09-22 01:40:59.585 26855 26855 D nativeloader: Load libframework-connectivity-tiramisu-jni.so using APEX ns com_android_tethering for caller /apex/com.android.tethering/javalib/framework-connectivity-t.jar: ok
09-22 01:40:59.598 26855 26855 D ApplicationLoaders: Returning zygote-cached class loader: /system_ext/framework/androidx.window.extensions.jar
09-22 01:40:59.598 26855 26855 D ApplicationLoaders: Returning zygote-cached class loader: /system_ext/framework/androidx.window.sidecar.jar
09-22 01:40:59.604 26855 26855 W .netbird.client: Failed to find entry 'classes.dex': Entry not found


jhf2442 • Sep 22 '25 19:09

Same issue

davidsmith91 • Oct 19 '25 09:10

Following up on the other linked issue: I gave ChatGPT 5.0 Extended Thinking both the logs from user xcslabs and the source files those logs refer to.

This is the answer ChatGPT gave. It may be the solution, or at least a starting point.

I don't know who the main developers are, so I don't know whom to tag. Please help me tag them so this can get fixed.


CHATGPT Answer: Here’s a concise comment you can paste into the GitHub issue—what’s happening, why it crashes on Android (GrapheneOS/Xiaomi, etc.), and what the proper fix is so Rosenpass actually works (not just hiding the panic).


Summary (user perspective)

The app panics when a peer connects. This is not just a crash: it also means Rosenpass isn’t actually applying preshared keys to WireGuard on Android, so you don’t get the promised “quantum-resistance.” The code currently masks an initialization error and then dereferences a nil handler.


What’s happening

  • The crash is a nil-pointer panic in rosenpass.NetbirdHandler.AddPeer called from rosenpass.Manager.addPeer during OnConnected.

  • In netbird/client/internal/rosenpass/manager.go, generateConfig() calls:

    // file: netbird/client/internal/rosenpass/manager.go
    m.rpWgHandler, _ = NewNetbirdHandler(m.preSharedKey, m.ifaceName) // error ignored
    

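    If NewNetbirdHandler fails, it returns (nil, err). Because the error is ignored, m.rpWgHandler stays nil. Later, addPeer(...) calls m.rpWgHandler.AddPeer(...) on that nil receiver. AddPeer then reads a field of the handler, so the call panics, which is consistent with the small fault address (addr=0x18) in the trace above.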

  • Why does NewNetbirdHandler fail? In netbird/client/internal/rosenpass/netbird_handler.go it does:

    // file: netbird/client/internal/rosenpass/netbird_handler.go
    hdlr.client, err = wgctrl.New()
    

    On Android (GrapheneOS, Xiaomi, etc.), wgctrl.New() commonly fails because the process can’t access the kernel WG API or a userspace WG UAPI socket at the expected path. SELinux/permissions and non-standard UAPI locations are typical causes. So the handler never initializes.

Bottom line: ignoring that constructor error both causes the panic and can make the app look like Rosenpass is active when it can’t actually apply keys.
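The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not NetBird code: handler, newHandler and tryAddPeer are hypothetical stand-ins, and the only assumption carried over from the analysis is that the constructor returns (nil, err) and the caller discards the error.

```go
package main

import (
	"errors"
	"fmt"
)

// handler is a hypothetical stand-in for NetbirdHandler.
type handler struct {
	peers map[string]string
}

// newHandler mimics a constructor that returns (nil, err) on failure.
func newHandler(fail bool) (*handler, error) {
	if fail {
		return nil, errors.New("wg control client unavailable")
	}
	return &handler{peers: map[string]string{}}, nil
}

// addPeer reads a field of h, so it panics when h is nil.
func (h *handler) addPeer(id, key string) {
	h.peers[id] = key
}

// tryAddPeer discards the constructor error, as in the quoted line,
// and reports whether the later method call panicked.
func tryAddPeer(fail bool) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	h, _ := newHandler(fail) // error discarded: h may be nil
	h.addPeer("peer-1", "key-1")
	return false
}

func main() {
	fmt.Println("constructor ok, panicked:", tryAddPeer(false))
	fmt.Println("constructor failed, panicked:", tryAddPeer(true))
}
```

Note that in Go the method call on a nil pointer is legal by itself. The panic only happens once the method touches a field, which is why the crash surfaces far away from the line that ignored the error.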


Why this matters

Without a working handler, Rosenpass never updates the WireGuard peer’s preshared key after the handshake. That means you don’t get the intended post-quantum protection on Android—even if logs/UI hint that Rosenpass is “enabled.”


Best fix (not just hiding the crash)

A. Fail fast instead of continuing with a nil handler

Propagate the error and abort Rosenpass startup if the handler can’t be built:

// file: netbird/client/internal/rosenpass/manager.go
- m.rpWgHandler, _ = NewNetbirdHandler(m.preSharedKey, m.ifaceName)
+ var err error
+ m.rpWgHandler, err = NewNetbirdHandler(m.preSharedKey, m.ifaceName /* see B */)
+ if err != nil {
+     return rp.Config{}, fmt.Errorf("creating rosenpass wg handler: %w", err)
+ }

(Also add a nil check before using m.rpWgHandler in addPeer / removePeer to prevent any future regressions.)
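A self-contained sketch of both parts of A (propagating the constructor error and guarding against a nil handler) might look like the following. The manager and handler types, the start method and errHandlerUnavailable are hypothetical simplifications for illustration, not the actual NetBird types or signatures.

```go
package main

import (
	"errors"
	"fmt"
)

// handler is a hypothetical stand-in for NetbirdHandler.
type handler struct{ peers map[string]string }

func (h *handler) AddPeer(id, key string) { h.peers[id] = key }

// manager is a hypothetical, trimmed-down stand-in for rosenpass.Manager.
type manager struct {
	rpWgHandler *handler
}

var errHandlerUnavailable = errors.New("rosenpass wg handler not initialized")

// start propagates the constructor error instead of discarding it.
func (m *manager) start(newHandler func() (*handler, error)) error {
	h, err := newHandler()
	if err != nil {
		return fmt.Errorf("creating rosenpass wg handler: %w", err)
	}
	m.rpWgHandler = h
	return nil
}

// addPeer refuses to run without a handler rather than dereferencing nil.
func (m *manager) addPeer(id, key string) error {
	if m.rpWgHandler == nil {
		return errHandlerUnavailable
	}
	m.rpWgHandler.AddPeer(id, key)
	return nil
}

func main() {
	m := &manager{}
	failing := func() (*handler, error) {
		return nil, errors.New("wg control client unavailable")
	}
	fmt.Println("start:", m.start(failing))
	fmt.Println("addPeer:", m.addPeer("peer-1", "key-1"))
}
```

With this shape a failed constructor shows up as an error at startup, and a later addPeer call returns an error instead of taking the whole process down.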

B. Make Rosenpass work on Android by removing the hard dependency on wgctrl

On Android, don’t rely on wgctrl. NetBird already configures WG locally via its own interfaces (WGIface / EndpointUpdater). Use those to apply the PSK that comes from Rosenpass:

  1. Introduce a small interface the handler can use to set a peer’s PSK and read current peers:
// file: netbird/client/internal/rosenpass/wg_configurator.go
type WGConfigurator interface {
    // allowedIPs uses net.IPNet, the type wgtypes.PeerConfig uses for AllowedIPs
    UpdatePeerPSK(pub wgtypes.Key, psk *wgtypes.Key, endpoint *net.UDPAddr, allowedIPs []net.IPNet) error
    PeerSnapshot() ([]wgtypes.Peer, error)
    RemovePeer(pub wgtypes.Key) error
}
  2. Provide an Android implementation of WGConfigurator that calls the existing NetBird Android WG code (no wgctrl):
// file: netbird/client/internal/rosenpass/wg_configurator_android.go
//go:build android
type wgConfiguratorAndroid struct { /* holds your WGIface */ }
  3. Change NetbirdHandler to depend on WGConfigurator instead of wgctrl.Client:
// file: netbird/client/internal/rosenpass/netbird_handler.go
- client *wgctrl.Client
+ cfg    WGConfigurator

…and replace client.Device(...) / ConfigureDevice(...) with cfg.PeerSnapshot(...) / cfg.UpdatePeerPSK(...) / cfg.RemovePeer(...).

  4. In manager.go, construct the right configurator per platform and pass it to NewNetbirdHandler(...). On Android, use the Android adapter; on desktop/server builds you can keep a default wgctrl-based adapter.
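The adapter idea in the steps above can be sketched without any WireGuard dependency. Everything here is an assumption made for illustration: key stands in for wgtypes.Key, pskConfigurator is a trimmed version of the proposed WGConfigurator, and memConfigurator is an in-memory stand-in for a platform adapter that would really call into NetBird's interface code.

```go
package main

import (
	"errors"
	"fmt"
)

// key is a hypothetical stand-in for a 32-byte WireGuard key.
type key [32]byte

// pskConfigurator is a trimmed, hypothetical version of WGConfigurator.
type pskConfigurator interface {
	UpdatePeerPSK(pub, psk key) error
	RemovePeer(pub key) error
}

// memConfigurator keeps PSKs in memory; a real adapter would push them
// to the platform's WireGuard implementation instead.
type memConfigurator struct {
	psks map[key]key
}

func newMemConfigurator() *memConfigurator {
	return &memConfigurator{psks: map[key]key{}}
}

func (c *memConfigurator) UpdatePeerPSK(pub, psk key) error {
	c.psks[pub] = psk
	return nil
}

func (c *memConfigurator) RemovePeer(pub key) error {
	if _, ok := c.psks[pub]; !ok {
		return errors.New("unknown peer")
	}
	delete(c.psks, pub)
	return nil
}

// handler depends only on the interface, not on a concrete WG client.
type handler struct{ cfg pskConfigurator }

func newHandler(cfg pskConfigurator) (*handler, error) {
	if cfg == nil {
		return nil, errors.New("nil configurator")
	}
	return &handler{cfg: cfg}, nil
}

// outputKey applies a freshly negotiated PSK through the adapter.
func (h *handler) outputKey(pub, psk key) error {
	return h.cfg.UpdatePeerPSK(pub, psk)
}

func main() {
	cfg := newMemConfigurator()
	h, err := newHandler(cfg)
	if err != nil {
		fmt.Println("handler error:", err)
		return
	}
	pub, psk := key{1}, key{2}
	fmt.Println("apply error:", h.outputKey(pub, psk))
	fmt.Println("stored PSK matches:", cfg.psks[pub] == psk)
}
```

Because the handler only sees the interface, the same handler code could be paired with a wgctrl-backed adapter on desktop and a different adapter on Android, and it can be unit tested with a fake like the one shown.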

C. Don’t advertise “Rosenpass enabled” until it actually applied a key

Gate any “Rosenpass enabled” / status bit behind a successful call to UpdatePeerPSK(...) (e.g., set a simple “healthy” flag in the handler after the first successful apply). That prevents false positives when the handler couldn’t initialize.
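One way the “healthy” flag from C could be sketched is shown below. The handler type, applyPSK, Healthy and the two apply functions are hypothetical names chosen for this example; the sketch assumes Go 1.19 or later for atomic.Bool.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// applyFunc is a hypothetical hook that pushes a PSK to WireGuard.
type applyFunc func(peer string, psk []byte) error

// handler records whether at least one PSK was applied successfully.
type handler struct {
	apply   applyFunc
	healthy atomic.Bool
}

// applyPSK marks the handler healthy only after a successful apply.
func (h *handler) applyPSK(peer string, psk []byte) error {
	if err := h.apply(peer, psk); err != nil {
		return fmt.Errorf("applying rosenpass PSK to %s: %w", peer, err)
	}
	h.healthy.Store(true)
	return nil
}

// Healthy is what a status bit or UI label would consult.
func (h *handler) Healthy() bool { return h.healthy.Load() }

// okApply and failingApply simulate a working and a broken backend.
func okApply(peer string, psk []byte) error { return nil }

func failingApply(peer string, psk []byte) error {
	return errors.New("configurator unavailable")
}

func main() {
	h := &handler{apply: failingApply}
	_ = h.applyPSK("peer-1", []byte("psk"))
	fmt.Println("healthy after failed apply:", h.Healthy())

	h.apply = okApply
	_ = h.applyPSK("peer-1", []byte("psk"))
	fmt.Println("healthy after successful apply:", h.Healthy())
}
```

The flag here is sticky: once set it stays set. Whether a later failure should clear it again is a design decision the text above leaves open.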


Why this is the right approach

  • It surfaces the real error (instead of crashing later on a nil) and avoids misleading users.
  • It makes Rosenpass actually functional on Android (GrapheneOS, Xiaomi, etc.) by using the configuration path that already works there.
  • It preserves the existing wgctrl path on platforms where it does work.
  • It gives a clear health signal so the UI/logs only claim “Rosenpass enabled” when keys are actually being rotated.

Minimal acceptance criteria / testing

  1. On Android, connect two peers and observe that after handshake the handler logs a successful PSK apply (e.g., “applied rosenpass PSK to ”).
  2. Verify via your WG state that the peer’s preshared key changes post-handshake.
  3. Ensure the app does not mark Rosenpass enabled (or silently continue) if the handler cannot initialize.

If maintainers prefer to keep wgctrl everywhere, they’ll need to ensure a working UAPI path and permissions on Android—which is brittle across devices/ROMs. The adapter approach above is more robust and keeps the promise of Rosenpass on Android without special privileges.

davidsmith91 • Oct 19 '25 10:10

+1, crash with Rosenpass

Exitium-DEV • Oct 20 '25 03:10

How can it be that this issue is not even acknowledged?

Blacks-Army • Oct 30 '25 11:10

This issue still persists with v0.2.2.

@pappz Sorry for the ping, but is the team aware of this issue? It has been ongoing for a few months, and there hasn’t been any response or acknowledgement yet.

Blacks-Army • Nov 27 '25 23:11

@doromaraujo could you take a look? This looks like a valid issue based on the source code.

pappz • Nov 27 '25 23:11