tailscale
No internet connection when using exit node on windows
What is the issue?
Enabling any exit node on my Windows computer breaks internet connectivity. All other devices on my tailnet can connect to and use all of my exit nodes as expected.
With no exit node set, the desktop gets a direct connection to the machine that serves as the exit node. As soon as any exit node is selected, connections to the internet and to every node in the tailnet fail.
Initial state, connectivity is fine:
$ tailscale ping vps1
pong from vps1 (100.68.228.146) via a.b.c.d:41641 in 7ms
$ tailscale status
100.89.183.91 desktop TechnoJo4@ windows -
(... other devices)
100.68.228.146 vps1 TechnoJo4@ linux active; offers exit node; direct a.b.c.d:41641
Pings to tailscale IPs and the wider internet all time out after setting the exit node:
$ tailscale set --exit-node=vps1
$ tailscale ping vps1
ping "100.68.228.146" timed out
(... x10)
no reply
$ tailscale status
100.89.183.91 desktop TechnoJo4@ windows offline
(... other devices)
100.68.228.146 vps1 TechnoJo4@ linux active; exit node; relay "tor", tx 2220 rx 0
# Health check:
# - not in map poll
Setting no exit node restores connectivity:
$ tailscale set --exit-node=
$ tailscale ping vps1
pong from vps1 (100.68.228.146) via a.b.c.d:41641 in 6ms
Tested exit nodes were Arch Linux on OVH, and Ubuntu on Oracle Cloud.
All other client devices are on the same network (eduroam) as the desktop. These other devices successfully connect to and use the exit nodes, and run Android and Arch Linux.
I'm not entirely sure whether this is actually distinct from #9199, but the original post and recent comments on that issue are inconsistent with my situation, which leads me to assume this isn't a duplicate.
Steps to reproduce
No response
Are there any recent changes that introduced the issue?
This started shortly after upgrading all my devices to 1.56.1, but the issue persisted after downgrading both the exit node and the windows device to 1.54.1, and I had no issue on that version previously. All nodes are now on 1.56.1 again.
I did not manually perform any other configuration changes.
OS
Windows
OS version
Windows 11
Tailscale version
1.56.1
Other software
Wireguard was previously installed but not running.
Bug report
BUG-f43bb2fcec2063a8f7be04966370cadd3b0d225b37843a272a9930c7bdf45f1a-20240113021614Z-af9973daa2c661b1
2024-01-13 02:15:43 | dns udp query: context deadline exceeded
2024-01-13 02:15:43 | control: controlhttp: trying to dial "controlplane.tailscale.com"
As a troubleshooting step, can you set "Override local DNS" in the DNS tab and take a new bugreport?
Peculiarities I have now recently noticed about my (uni's) network:
They are squatting 1.1.1.1 (ipconfig /all shows DHCP Server . . . . . . . . . . . : 1.1.1.1), so DNS requests to that address just time out.
Their DNS server is on 10.32.x.y and my IP is on 10.238.x.y, yet the subnet mask is 255.255.240.0.
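To make the subnet observation concrete, here is a minimal sketch using Python's standard ipaddress module. The last two octets are hypothetical stand-ins for the elided "x.y" values. The result does not depend on them, because the mask 255.255.240.0 fully covers the first two octets, and those already differ (10.32 vs 10.238).

```python
import ipaddress

def same_subnet(addr_a: str, addr_b: str, netmask: str) -> bool:
    """Return True if addr_b lies in the network that addr_a/netmask defines."""
    network = ipaddress.ip_network(f"{addr_a}/{netmask}", strict=False)
    return ipaddress.ip_address(addr_b) in network

# Hypothetical stand-ins for the elided "x.y" octets in the report.
client_ip = "10.238.0.10"
dns_server = "10.32.0.1"
netmask = "255.255.240.0"

# Network the client considers on-link under this mask.
print(ipaddress.ip_network(f"{client_ip}/{netmask}", strict=False))
# Whether the DNS server is on that same link.
print(same_subnet(client_ip, dns_server, netmask))
```

Under these assumptions the DNS server is off-link, so queries to it have to follow a route through a gateway rather than being delivered directly on the local network.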
I've now set my DNS to Quad9 everywhere (which I've confirmed works with the exit node disabled, using nslookup "controlplane.tailscale.com" 9.9.9.9, unlike 1.1.1.1), and enabled Override local DNS in Tailscale, as instructed. The issue persists; here's a new bugreport:
BUG-f43bb2fcec2063a8f7be04966370cadd3b0d225b37843a272a9930c7bdf45f1a-20240131201023Z-f996b319b0cd374f
I got more info and an extremely hacky workaround. It seems to be some kind of route issue? This won't be a rigorous explanation, but hopefully it helps with identifying and fixing the underlying issue.
I ran Wireshark and looked at traffic on the Wi-Fi and Tailscale interfaces. All traffic seems to go to the Tailscale interface and never reach the real network, including packets to controlplane and the DERP relays. I've been throwing shit at the wall for weeks now, so as another random guess I opened the Tailscale logs, searched for "timeout opening", and ran route add for every IP I saw, with my Wi-Fi's gateway as the gateway, forcing traffic to go there.
This seems to have worked? tailscale ping vps1 finally got a reply, though via DERP. After one more route add with the IP of my VPS, I have a direct connection, and internet connectivity through my exit node works again.
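The log-scraping step of this workaround could be sketched roughly as follows. This is an illustration under assumptions, not what was actually run: the function name route_commands, the sample log lines, and the gateway address are hypothetical, and the real layout of the Tailscale log lines is not known here, so the code simply pulls any IPv4 address out of lines containing "timeout opening". It skips addresses in 100.64.0.0/10, the range Tailscale IPs come from, and only prints the Windows route add commands instead of executing them.

```python
import ipaddress
import re

# Loose IPv4 matcher; candidates are validated with ipaddress afterwards.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
# Tailscale addresses come from this range; they should not be routed via Wi-Fi.
TAILNET_RANGE = ipaddress.ip_network("100.64.0.0/10")

def route_commands(log_lines, gateway, marker="timeout opening"):
    """Build one host-route command per unique IP on lines containing marker."""
    seen = []
    for line in log_lines:
        if marker not in line:
            continue
        for candidate in IPV4_RE.findall(line):
            try:
                ip = ipaddress.IPv4Address(candidate)
            except ValueError:
                continue  # e.g. "999.1.1.1" is not a valid address
            if ip in TAILNET_RANGE or str(ip) in seen:
                continue
            seen.append(str(ip))
    # A /32 mask makes each entry a host route through the given gateway.
    return [f"route add {ip} mask 255.255.255.255 {gateway}" for ip in seen]

# Hypothetical log lines and gateway, for illustration only.
sample = [
    "example: timeout opening connection to 192.0.2.10:443",
    "example: unrelated line mentioning 198.51.100.7",
    "example: timeout opening connection to 192.0.2.10:443",
]
for command in route_commands(sample, gateway="10.238.0.1"):
    print(command)
```

On Windows, route add needs an elevated prompt, and routes added this way are lost on reboot unless the -p flag is used.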
Not sure if there are any security (or otherwise) implications to this workaround and I'd rather avoid running it every time I restart my PC, so hopefully resolution is easier from this. I can give another bugreport or any other information if that'd be helpful.
Did you enable IP Forwarding on your VPS?
https://tailscale.com/kb/1103/exit-nodes?tab=linux#enable-ip-forwarding
Did you enable IP Forwarding on your VPS?
Yes, I have verified many times. The exit node works on all other devices. I'm only running into this issue and need to run this workaround on my Windows PC.