Make telepresence discover that the DNS IP is in conflict with a proxied subnet
Background
Users are experiencing loss of network connectivity after they do telepresence connect. This will happen if the DNS is configured to use a server IP that is mapped by a subnet in Telepresence's TUN device.
A common scenario is an EC2 instance that uses a DNS server at 172.31.0.2 (a common default nameserver). Typical output from resolvectl dns:
$ resolvectl dns
Global:
Link 2 (ens5): 172.31.0.2
When Telepresence connects to an EKS cluster, the cluster uses a pod-subnet with an overlapping IP. The log shows:
<timestamp> info daemon/watch-cluster-info : Adding pod subnet 172.31.0.0/18
While connected, all requests to 172.31.0.2 will be routed to the cluster, but the cluster has no DNS server at that IP. All queries for names covered by the exclude-suffix list will then fail (the ones not covered will be routed to the cluster's proper DNS and likely succeed).
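To verify the overlap on an affected host, the daemon log can be checked for the subnets Telepresence announces (the log path below assumes the default Linux location):

$ grep 'Adding pod subnet' ~/.cache/telepresence/logs/daemon.log
<timestamp> info daemon/watch-cluster-info : Adding pod subnet 172.31.0.0/18

172.31.0.0/18 covers 172.31.0.0 through 172.31.63.255, so it contains the resolver at 172.31.0.2.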
Workaround
Adding the local DNS IP to the never-proxy list as 172.31.0.2/32 solves the problem. The DNS queries will no longer be routed to the cluster and will instead proceed to their original network destination.
Desired improvement
Telepresence should detect this situation and automatically add the conflicting entry to the never-proxy list.
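A minimal sketch of what the detection could look like, in Go (Telepresence's implementation language); the function and type names here are hypothetical, not taken from the Telepresence code base:

package main

import (
	"fmt"
	"net"
)

// dnsConflictsWithSubnets returns the proxied subnets that contain the
// given DNS server IP. A non-empty result means DNS traffic would be
// hijacked by the TUN device while connected.
func dnsConflictsWithSubnets(dnsIP net.IP, subnets []*net.IPNet) []*net.IPNet {
	var conflicts []*net.IPNet
	for _, sn := range subnets {
		if sn.Contains(dnsIP) {
			conflicts = append(conflicts, sn)
		}
	}
	return conflicts
}

func main() {
	dns := net.ParseIP("172.31.0.2")
	_, podSubnet, _ := net.ParseCIDR("172.31.0.0/18")
	for _, sn := range dnsConflictsWithSubnets(dns, []*net.IPNet{podSubnet}) {
		// A real implementation would add dns/32 to the never-proxy
		// routes here instead of just reporting the conflict.
		fmt.Printf("DNS server %s conflicts with proxied subnet %s; adding %s/32 to never-proxy\n", dns, sn, dns)
	}
}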
Alternative
The telepresence test-vpn command could detect this conflict and suggest the aforementioned workaround.
Thanks @thallgren for filing this. One more ask: please add default route entries for the VPN gateway. My SSH finally worked after this :) More context in this issue.
And thanks again for the very prompt responses and for finding the solution!
@bhavitsharma you're more than welcome. Your input was very helpful in finding the root cause. And now you've also verified that the workaround indeed solves your problem. Thanks!