No internet connection when using exit node on windows

Open TechnoJo4 opened this issue 1 year ago • 3 comments

What is the issue?

Enabling any exit node on my windows computer breaks internet connectivity. All other devices on my tailnet can connect to and use all of my exit nodes as expected.

With no exit node set, the desktop can get a direct connection to the exit node. As soon as any exit node is used, connecting to the internet or any node in the tailnet fails.

Initial state, connectivity is fine:

$ tailscale ping vps1
pong from vps1 (100.68.228.146) via a.b.c.d:41641 in 7ms

$ tailscale status
100.89.183.91   desktop              TechnoJo4@   windows -
(... other devices)
100.68.228.146  vps1                 TechnoJo4@   linux   active; offers exit node; direct a.b.c.d:41641

Pings to tailscale IPs and the wider internet all time out after setting the exit node:

$ tailscale set --exit-node=vps1

$ tailscale ping vps1
ping "100.68.228.146" timed out
(... x10)
no reply

$ tailscale status
100.89.183.91   desktop              TechnoJo4@   windows offline
(... other devices)
100.68.228.146  vps1                 TechnoJo4@   linux   active; exit node; relay "tor", tx 2220 rx 0

# Health check:
#     - not in map poll

Setting no exit node restores connectivity:

$ tailscale set --exit-node=

$ tailscale ping vps1
pong from vps1 (100.68.228.146) via a.b.c.d:41641 in 6ms

Tested exit nodes were Arch Linux on OVH, and Ubuntu on Oracle Cloud.

All other client devices are on the same network (eduroam) as the desktop. These other devices successfully connect to and use the exit nodes, and run Android and Arch Linux.

I'm not entirely sure whether this is actually distinct from #9199, but the OP and recent comments on that issue are inconsistent with my situation, which leads me to assume this isn't a duplicate.

Steps to reproduce

No response

Are there any recent changes that introduced the issue?

This started shortly after upgrading all my devices to 1.56.1, but the issue persisted after downgrading both the exit node and the windows device to 1.54.1, and I had no issue on that version previously. All nodes are now on 1.56.1 again.

I did not manually perform any other configuration changes.

OS

Windows

OS version

Windows 11

Tailscale version

1.56.1

Other software

Wireguard was previously installed but not running.

Bug report

BUG-f43bb2fcec2063a8f7be04966370cadd3b0d225b37843a272a9930c7bdf45f1a-20240113021614Z-af9973daa2c661b1

TechnoJo4 avatar Jan 13 '24 02:01 TechnoJo4

2024-01-13 02:15:43 | dns udp query: context deadline exceeded
2024-01-13 02:15:43 | control: controlhttp: trying to dial "controlplane.tailscale.com"

As a troubleshooting step, can you set "Override local DNS" in the DNS tab and take a new bugreport?

kelivel avatar Jan 29 '24 16:01 kelivel

Peculiarities I have recently noticed about my (uni's) network: they are squatting 1.1.1.1 (ipconfig /all shows DHCP Server . . . . . . . . . . . : 1.1.1.1), so DNS requests to that address just time out. Their DNS server is on 10.32.x.y and my IP is on 10.238.x.y, yet the subnet mask is 255.255.240.0.
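To make the subnet observation concrete, here is a minimal sketch using Python's standard `ipaddress` module. The host octets are elided in the comment above, so the addresses in the usage lines are placeholders picked only for illustration, and `on_link` is a hypothetical helper name. It checks whether another address falls inside the network implied by a local IP and its netmask.

```python
import ipaddress


def on_link(local_ip: str, netmask: str, other_ip: str) -> bool:
    """Return True if other_ip lies inside the network of local_ip/netmask."""
    network = ipaddress.ip_interface(f"{local_ip}/{netmask}").network
    return ipaddress.ip_address(other_ip) in network


# 255.255.240.0 is a /20 prefix, i.e. 4096 addresses per network.
PREFIX_LEN = ipaddress.ip_network("0.0.0.0/255.255.240.0").prefixlen

if __name__ == "__main__":
    # Placeholder hosts: the real x.y octets are not given in the thread.
    print(PREFIX_LEN)
    print(on_link("10.238.5.20", "255.255.240.0", "10.32.0.1"))
    print(on_link("10.238.5.20", "255.255.240.0", "10.238.15.1"))
```

Whatever the elided octets are, a /20 that contains a 10.238.x.y address cannot contain a 10.32.x.y address, so the DNS server is only reachable through a gateway rather than directly on the local link.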

I've now set my DNS to Quad9 everywhere (which I've confirmed works with the exit node disabled, using nslookup "controlplane.tailscale.com" 9.9.9.9, unlike 1.1.1.1), and enabled Override local DNS in Tailscale, as instructed. The issue persists; here's a new bugreport: BUG-f43bb2fcec2063a8f7be04966370cadd3b0d225b37843a272a9930c7bdf45f1a-20240131201023Z-f996b319b0cd374f

TechnoJo4 avatar Jan 31 '24 20:01 TechnoJo4

I got more info and an extremely hacky workaround. It seems to be some kind of route issue? This won't be a rigorous explanation, but hopefully it helps in identifying and fixing the underlying issue.

I ran Wireshark, looked at traffic on the Wi-Fi and Tailscale interfaces, and noticed that all traffic seems to go to the Tailscale interface and never reaches the real network, including packets to controlplane and the DERP relays. I've been throwing shit at the wall for weeks now, so as another random guess I decided to open up the Tailscale logs, search for "timeout opening", and route add every IP I saw, with my Wi-Fi's gateway as the gateway, forcing traffic to go there.

This seems to have worked? tailscale ping vps1 finally got a reply, though via DERP. After one more route add with the IP of my VPS, I have a direct connection, and internet connectivity through my exit node works again.
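The workaround described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the exact commands that were run: the log line format is not shown in this thread, so the sample lines, the helper name `route_commands`, and the gateway address are all hypothetical. The sketch collects IPv4 addresses from lines containing "timeout opening", skips anything in 100.64.0.0/10 (the CGNAT range Tailscale assigns its own addresses from), and builds one Windows `route add` host route per address via the given gateway.

```python
import ipaddress
import re

# Loose IPv4 pattern; candidates are validated with ipaddress below.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
# Tailscale node addresses live here and must stay on the tunnel.
CGNAT = ipaddress.IPv4Network("100.64.0.0/10")


def route_commands(log_lines, gateway, marker="timeout opening"):
    """Build `route add` host-route commands for IPs on lines containing marker."""
    ipaddress.IPv4Address(gateway)  # raises ValueError if gateway is malformed
    seen = []
    for line in log_lines:
        if marker not in line:
            continue
        for candidate in IPV4_RE.findall(line):
            try:
                ip = ipaddress.IPv4Address(candidate)
            except ValueError:
                continue  # e.g. "999.1.1.1" matches the regex but is not an IP
            if ip in CGNAT or str(ip) in seen:
                continue
            seen.append(str(ip))
    return [f"route add {ip} mask 255.255.255.255 {gateway}" for ip in seen]


if __name__ == "__main__":
    # Hypothetical log lines and gateway (documentation address ranges).
    sample = [
        "timeout opening 203.0.113.7:443",
        "unrelated line mentioning 198.51.100.1",
        "timeout opening 100.68.228.146:41641 then 203.0.113.7:443",
        "timeout opening 198.51.100.9:3478",
    ]
    for cmd in route_commands(sample, "192.0.2.1"):
        print(cmd)
```

The generated commands would need to be run from an elevated prompt. Routes added this way without the `-p` flag do not survive a reboot, which matches the need to redo the workaround after every restart.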

Not sure if there are any security (or other) implications to this workaround, and I'd rather avoid running it every time I restart my PC, so hopefully this makes the underlying issue easier to resolve. I can give another bugreport or any other information if that'd be helpful.

TechnoJo4 avatar Feb 14 '24 03:02 TechnoJo4

Did you enable IP Forwarding on your VPS?

https://tailscale.com/kb/1103/exit-nodes?tab=linux#enable-ip-forwarding

wingcomm avatar May 24 '24 08:05 wingcomm

Did you enable IP Forwarding on your VPS?

Yes, I have verified many times. The exit node works on all other devices. I'm only running into this issue and need to run this workaround on my Windows PC.

TechnoJo4 avatar May 24 '24 11:05 TechnoJo4