sing-box
[Android] WireGuard does not work properly after network outage and recovery
Operating system
Android
System version
lineageos 20
Installation type
sing-box for Android Graphical Client
If you are using a graphical client, please provide the version of the client.
1.8.4
Version
No response
Description
WireGuard works properly on startup. However, after the network is disconnected and reconnected, it can no longer complete a handshake with the server.
Reproduction
- Start sing-box.
- Check that the WireGuard network is reachable with curl http://192.168.2.1; it works.
- Turn off the network, then turn it back on.
- Wait a while and try to access http://192.168.2.1 again; it does not work.
I dumped the traffic on the server (see the tcpdump output at the end). It seems that the server can receive data from sing-box and send data back to sing-box.
configuration
{
"log": { "level": "debug" },
"dns": {
"servers": [
{
"tag": "home-dns",
"address": "udp://192.168.6.1",
"detour": "direct",
"strategy": "ipv4_only"
},
{
"tag": "wg-dns",
"address": "udp://192.168.2.6",
"detour": "go-home",
"strategy": "ipv4_only"
},
{
"tag": "default-dns",
"strategy": "ipv4_only",
"address": "h3://223.5.5.5/dns-query",
"detour": "direct"
}
],
"rules": [
{
"domain_suffix": [".home.example.com"],
"wifi_ssid": ["home-dns"],
"server": "family"
},
{
"domain_suffix": [".home.example.com"],
"server": "wg-dns"
}
],
"final": "default-dns"
},
"inbounds": [
{
"type": "tun",
"tag": "tun-in",
"interface_name": "tun0",
"inet4_address": "172.19.0.1/30",
"inet6_address": "fdfe:2204:cfab::1/126",
"mtu": 9000,
"auto_route": true,
"strict_route": true,
"inet4_route_address": ["0.0.0.0/1", "128.0.0.0/1"],
"inet6_route_address": ["::/1", "8000::/1"],
"endpoint_independent_nat": false,
"stack": "system",
"sniff": true
}
],
"outbounds": [
{ "type": "direct", "tag": "direct" },
{ "type": "block", "tag": "block" },
{ "type": "dns", "tag": "dns" },
{
"type": "wireguard",
"tag": "go-home",
"local_address": ["10.249.0.3/32"],
"private_key": "KNx4llKEZwqB5Q69MMVlFfj+7pVaRIFiw63tkSvblmA=",
"peers": [
{
"server": "home.example.com",
"server_port": 51802,
"public_key": "DBjU7sR7/Qx65b6m4IKTAZrjDHBeWsruMyoSpV1ES1U=",
"allowed_ips": ["192.168.2.0/24", "10.249.0.0/24"]
}
]
}
],
"route": {
"final": "direct",
"auto_detect_interface": true,
"rules": [
{ "protocol": "dns", "outbound": "dns" },
{
"wifi_ssid": ["abc"],
"ip_cidr": ["192.168.2.0/24", "10.249.0.0/24"],
"outbound": "direct"
},
{
"ip_cidr": ["192.168.2.0/24", "10.249.0.0/24"],
"outbound": "go-home"
}
]
}
}
sing-box logs
tcpdump on server
before network disconnect
21:37:53.245737 pppoe-wan In IP 180.139.224.173.24192 > 124.227.226.83.51802: UDP, length 96
21:37:53.248727 pppoe-wan In IP 180.139.224.173.24192 > 124.227.226.83.51802: UDP, length 96
21:37:53.249471 pppoe-wan Out IP 124.227.226.83.51802 > 180.139.224.173.24192: UDP, length 96
21:37:53.279670 pppoe-wan In IP 180.139.224.173.24192 > 124.227.226.83.51802: UDP, length 96
21:38:03.482498 pppoe-wan Out IP 124.227.226.83.51802 > 180.139.224.173.24192: UDP, length 32
21:38:03.505715 pppoe-wan In IP 180.139.224.173 > 124.227.226.83: ICMP 180.139.224.173 udp port 24192 unreachable, length 68
After network recovery
21:38:04.607864 pppoe-wan In IP 180.139.224.173.24206 > 124.227.226.83.51802: UDP, length 96
21:38:04.608708 pppoe-wan Out IP 124.227.226.83.51802 > 180.139.224.173.24206: UDP, length 96
21:38:05.511612 pppoe-wan In IP 180.139.224.173.24206 > 124.227.226.83.51802: UDP, length 96
21:38:05.512196 pppoe-wan Out IP 124.227.226.83.51802 > 180.139.224.173.24206: UDP, length 80
21:38:05.581322 pppoe-wan In IP 180.139.224.173.24206 > 124.227.226.83.51802: UDP, length 96
21:38:05.581891 pppoe-wan Out IP 124.227.226.83.51802 > 180.139.224.173.24206: UDP, length 96
21:38:06.602565 pppoe-wan Out IP 124.227.226.83.51802 > 180.139.224.173.24206: UDP, length 96
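As an aid for reading the capture above, here is a minimal Go sketch that guesses the WireGuard message type from the UDP payload length that tcpdump prints. It assumes the standard WireGuard message sizes (148 bytes for a handshake initiation, 92 for a handshake response, 64 for a cookie reply, and 32 bytes of overhead on transport data padded to a multiple of 16). The function name classifyByLength is made up for this illustration, and length alone is only a heuristic: the authoritative field is the message type in the first byte of the payload.

```go
package main

import "fmt"

// Sizes of WireGuard messages as carried in the UDP payload.
const (
	handshakeInitiationSize = 148
	handshakeResponseSize   = 92
	cookieReplySize         = 64
	transportOverhead       = 32 // 16-byte header + 16-byte authentication tag
	paddingMultiple         = 16 // plaintext is normally padded to this multiple
)

// classifyByLength is a hypothetical helper that guesses the message type
// from the UDP payload length alone. It cannot tell a cookie reply from a
// transport packet carrying 32 bytes of padded payload, so that case is
// reported as ambiguous.
func classifyByLength(n int) string {
	switch {
	case n == handshakeInitiationSize:
		return "handshake initiation"
	case n == handshakeResponseSize:
		return "handshake response"
	case n == transportOverhead:
		return "keepalive"
	case n == cookieReplySize:
		return "cookie reply or transport data"
	case n > transportOverhead && (n-transportOverhead)%paddingMultiple == 0:
		return "transport data"
	default:
		return "unknown"
	}
}

func main() {
	// Lengths that appear in the tcpdump output of this report.
	for _, n := range []int{96, 80, 32} {
		fmt.Printf("UDP length %3d: %s\n", n, classifyByLength(n))
	}
}
```

Under these assumptions, the 96- and 80-byte packets in the capture would be transport data and the 32-byte packet a keepalive. A capture with payload bytes (for example tcpdump -X) would be needed to confirm this.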
Logs
No response
Integrity requirements
- [X] I confirm that I have read the documentation, understand the meaning of all the configuration items I wrote, and did not pile up seemingly useful options or default values.
- [X] I confirm that I have provided the server and client configuration files and process that can be reproduced locally, instead of a complicated client configuration file that has been stripped of sensitive data.
- [X] I confirm that I have provided the simplest configuration that can be used to reproduce the error I reported, instead of depending on remote servers, TUN, graphical interface clients, or other closed-source software.
- [X] I confirm that I have provided the complete configuration files and logs, rather than just providing parts I think are useful out of confidence in my own intelligence.
I ran into a similar issue on Linux with "auto_detect_interface": true.
Everything works fine until the default interface changes:
DEBUG outbound/wireguard[warp]: peer(bmXO…fgyo) - received handshake response
DEBUG outbound/wireguard[warp]: peer(bmXO…fgyo) - sending keepalive packet
INFO router: updated default interface eth0, index 2
DEBUG outbound/wireguard[warp]: routine: receive incoming receive - stopped
DEBUG outbound/wireguard[warp]: peer(bmXO…fgyo) - retrying handshake because we stopped hearing back after 15 seconds
DEBUG outbound/wireguard[warp]: peer(bmXO…fgyo) - sending handshake initiation
DEBUG outbound/wireguard[warp]: peer(bmXO…fgyo) - handshake did not complete after 5 seconds, retrying (try 2)
DEBUG outbound/wireguard[warp]: peer(bmXO…fgyo) - handshake did not complete after 5 seconds, retrying (try 3)
DEBUG outbound/wireguard[warp]: peer(bmXO…fgyo) - handshake did not complete after 5 seconds, retrying (try 4)
After that, all handshakes fail and never succeed again.
I guess it's a bug. I'll try to find a minimal reproduction.
With a small patch:
diff --git a/outbound/wireguard.go b/outbound/wireguard.go
index 045241f..c08c6b8 100644
--- a/outbound/wireguard.go
+++ b/outbound/wireguard.go
@@ -165,7 +165,10 @@ func (w *WireGuard) Close() error {
}
func (w *WireGuard) InterfaceUpdated() {
- w.device.BindUpdate()
+ err := w.device.BindUpdate()
+ if err != nil {
+ w.logger.Error("InterfaceUpdated ", err)
+ }
return
}
INFO router: updated default interface eth0, index 2
DEBUG outbound/wireguard[warp]: routine: receive incoming receive - stopped
ERROR outbound/wireguard[warp]: InterfaceUpdated use of closed network connection
INFO router: updated default interface wlp2s0, index 3
ERROR outbound/wireguard[warp]: InterfaceUpdated use of closed network connection
INFO router: updated default interface eth0, index 2
ERROR outbound/wireguard[warp]: InterfaceUpdated use of closed network connection
Try https://github.com/SagerNet/sing-box/commit/dd52c26ae1bd6751b99d75d315048d71c592f033
https://github.com/SagerNet/sing-box/commit/dd52c26ae1bd6751b99d75d315048d71c592f033 with v1.8.6 got the same errors. But, my bad, I didn't mention that I'm using detour and system_interface with the wireguard outbound:
{
"detour": "auto:proxy",
"interface_name": "warp",
"system_interface": true,
"tag": "warp",
"type": "wireguard"
...
}
Here is my attempt at a minimal reproduction:
{
"inbounds": [
{
"listen": "0.0.0.0",
"listen_port": 1080,
"type": "mixed"
}
],
"log": {
"disabled": false,
"level": "trace",
"timestamp": true
},
"outbounds": [
{
"tag": "warp",
"detour": "proxy",
"system_interface": false,
"type": "wireguard",
???
},
{
"tag": "proxy",
"type": "vmess",
???
}
],
"route": {
"auto_detect_interface": true,
"final": "warp"
}
}
I have two network interfaces, eth0 and wlp2s0. I can reproduce the errors by plugging and unplugging eth0.
Similar problem. When I enable WireGuard in sing-box on my Android phone away from home over mobile data, it works fine. When I come home and the phone connects to my home Wi-Fi, the phone loses Internet access, and to get it back I have to shut down sing-box. sing-box version 1.8.8, Android 14. Update: I checked on 1.9.0-beta.8 and the same problem exists.
I think these problems are caused by a stale bound/connected UDP socket.
Currently the WireGuard transport creates and connects the underlying UDP socket on start, and uses the same UDP socket for all subsequent send/recv calls. Once connected, this UDP socket is bound to a local IP and port.
After a network change or recovery, the host's IP address changes, and the socket's local IP address is no longer available. The socket API doesn't report any error for UDP on this socket, so sends appear to succeed (although the packet may or may not arrive at the destination) and nothing is received afterward.
This undetected dead UDP socket also causes problems for IPv6. Some ISPs change your prefix periodically, so the host's IPv6 address changes and kills the previously bound UDP socket. And during startup, while the IPv6 address is still in the tentative state, the connect succeeds but binds to a link-local IPv6 address, which also leaves a dead socket.
I think the above conditions can be simulated by manually deleting or changing the host's local IP address (the one the UDP socket is bound to), and tested using docker/netcat.
If we can't easily detect this, maybe we can just recreate/reconnect the UDP socket if nothing has been received for a specific duration.
The same situation occurs for me after a network change or restoration, and also when I use the Clash API to close all connections: the WireGuard connection fails to restore automatically.
My WireGuard configuration uses an upstream and is deployed on a side Linux device (an LXC container in Proxmox).
I had the same issue on Android 14 with Sing-Box 1.8.9. While setting "gso": true in the WireGuard outbound configuration fixed the connection drop after switching networks, it now takes about 30 seconds to come back online.
Thanks for the tip, it worked for me! There are no more wireguard connection drops when moving from one network to another. In any case, it is not noticeable at all, not 30 seconds, not even one second. Android 14, arm64-v8a and 1.9.0-beta.16.
Try f61b272cbf3732ac7d8307ee787963ba78ca5945
https://github.com/SagerNet/sing-box/commit/f61b272cbf3732ac7d8307ee787963ba78ca5945 works for me, with 1.8.9
03:24:39 INFO router: updated default interface wlp2s0, index 3
03:24:39 DEBUG outbound/wireguard[warp]: routine: receive incoming receive - stopped
03:24:39 DEBUG outbound/wireguard[warp]: udp bind has been updated
03:24:39 DEBUG outbound/wireguard[warp]: routine: receive incoming receive - started
03:24:39 INFO outbound/vmess[proxy-1]: outbound packet connection to 162.159.192.1:2408
03:25:03 DEBUG outbound/wireguard[warp]: peer(bmXO…fgyo) - sending handshake initiation
03:25:03 DEBUG outbound/wireguard[warp]: peer(bmXO…fgyo) - received handshake response
03:25:03 DEBUG outbound/wireguard[warp]: peer(bmXO…fgyo) - sending keepalive packet
03:25:20 DEBUG outbound/wireguard[warp]: peer(bmXO…fgyo) - retrying handshake because we stopped hearing back after 15 seconds
03:25:20 DEBUG outbound/wireguard[warp]: peer(bmXO…fgyo) - sending handshake initiation
03:25:20 DEBUG outbound/wireguard[warp]: peer(bmXO…fgyo) - received handshake response
03:25:20 DEBUG outbound/wireguard[warp]: peer(bmXO…fgyo) - sending keepalive packet
03:25:21 INFO router: updated default interface eth0, index 2
03:25:21 DEBUG outbound/wireguard[warp]: routine: receive incoming receive - stopped
03:25:21 DEBUG outbound/wireguard[warp]: udp bind has been updated
03:25:21 DEBUG outbound/wireguard[warp]: routine: receive incoming receive - started
03:25:21 INFO outbound/vmess[proxy-1]: outbound packet connection to 162.159.192.1:2408
03:25:36 DEBUG outbound/wireguard[warp]: peer(bmXO…fgyo) - sending keepalive packet
03:25:37 INFO router: updated default interface wlp2s0, index 3
03:25:37 DEBUG outbound/wireguard[warp]: routine: receive incoming receive - stopped
03:25:37 DEBUG outbound/wireguard[warp]: udp bind has been updated
03:25:37 DEBUG outbound/wireguard[warp]: routine: receive incoming receive - started
03:25:37 INFO outbound/vmess[proxy-1]: outbound packet connection to 162.159.192.1:2408
On versions 1.8.10 - 1.8.14 and 1.9.0-rc.1 - 1.9.0-rc.22, the application's UI stops responding after switching from Wi-Fi to mobile data if the configuration has active WireGuard outbounds without "gso": true. Android 14, arm64-v8a.
Thanks for those tips man; "gso": true really does work for me! Gosh I've had this problem with sing-box forever ago and always wondered if it was just me