Unable to connect to hosts through VPN interface from colima VM

Open chriscasola opened this issue 2 years ago • 16 comments

Description

Initially when creating a colima VM, I can do something like colima ssh and then curl -v internal.corporate-domain.com and successfully connect and get back a response.

But at a later time, sometimes weeks/months later, all networking breaks between the VM (and any docker containers running on the VM) and the internal corporate network. Restarting the computer and/or the colima VM does not resolve the issue. The only resolution is to tear down the colima VM and create a new one.

The VPN on the Mac creates an interface like this:

inet 172.19.21.19 --> 172.19.21.19 netmask 0xffffffff

There is also a physical network interface on the Mac:

inet 192.168.86.42 netmask 0xffffff00 broadcast 192.168.86.255

Here is some debugging output from the colima VM:

successful nslookup

colima:/Users/ccasola$ nslookup xxx-staging.xxx.com
Server:		192.168.107.1
Address:	192.168.107.1:53

Non-authoritative answer:
Can't find xxx-staging.xxx.com: No answer

Non-authoritative answer:
Name:	xxx-staging.xxx.com
Address: 172.21.132.52

curl fails to connect

colima:/Users/ccasola$ curl -v http://xxx-staging.xxx.com/
*   Trying 172.21.132.52:80...
* connect to 172.21.132.52 port 80 failed: Host is unreachable
* Failed to connect to xxx-staging.xxx.com port 80 after 3097 ms: Host is unreachable
* Closing connection 0
curl: (7) Failed to connect to xxx-staging.xxx.com port 80 after 3097 ms: Host is unreachable

ifconfig on colima vm

colima:/Users/ccasola$ ifconfig
br-861424ecd59b Link encap:Ethernet  HWaddr 02:42:52:40:74:7C  
          inet addr:172.20.0.1  Bcast:172.20.255.255  Mask:255.255.0.0
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

br-aa39713c361f Link encap:Ethernet  HWaddr 02:42:68:34:DE:D0  
          inet addr:172.21.0.1  Bcast:172.21.255.255  Mask:255.255.0.0
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

br-cf68d4a5c6cf Link encap:Ethernet  HWaddr 02:42:1A:D5:88:4E  
          inet addr:172.18.0.1  Bcast:172.18.255.255  Mask:255.255.0.0
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

docker0   Link encap:Ethernet  HWaddr 02:42:93:3F:AD:47  
          inet addr:172.17.0.1  Bcast:172.17.255.255  Mask:255.255.0.0
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

eth0      Link encap:Ethernet  HWaddr 52:55:55:83:2B:8F  
          inet addr:192.168.5.15  Bcast:0.0.0.0  Mask:255.255.255.0
          inet6 addr: fe80::5055:55ff:fe83:2b8f/64 Scope:Link
          inet6 addr: fec0::5055:55ff:fe83:2b8f/64 Scope:Site
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:405 errors:0 dropped:0 overruns:0 frame:0
          TX packets:354 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:41947 (40.9 KiB)  TX bytes:43449 (42.4 KiB)

eth1      Link encap:Ethernet  HWaddr 5A:94:EF:B8:ED:B2  
          inet addr:192.168.107.2  Bcast:0.0.0.0  Mask:255.255.255.0
          inet6 addr: fe80::5894:efff:feb8:edb2/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:22 errors:0 dropped:0 overruns:0 frame:0
          TX packets:35 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:1668 (1.6 KiB)  TX bytes:2434 (2.3 KiB)

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:9 errors:0 dropped:0 overruns:0 frame:0
          TX packets:9 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:792 (792.0 B)  TX bytes:792 (792.0 B)

Version

Colima Version:

colima version 0.4.4
git commit: 8bb1101a861a8b6d2ef6e16aca97a835f65c4f8f

runtime: docker
arch: aarch64
client: v20.10.17
server: v20.10.11

Lima Version:

limactl version 0.11.3

Qemu Version:

qemu-img version 7.0.0
Copyright (c) 2003-2022 Fabrice Bellard and the QEMU Project developers

Operating System

  • [ ] macOS Intel
  • [X] macOS M1
  • [ ] Linux

Reproduction Steps

  1. colima start
  2. colima ssh
  3. curl -v corporate-app.corporation.com

Expected behaviour

curl should be able to connect to the host.

Additional context

This is not isolated to a single person; multiple people at our company have been experiencing this. Thanks in advance for any help you can provide!

chriscasola avatar Aug 09 '22 13:08 chriscasola

@chriscasola any idea in what version you started noticing this?

abiosoft avatar Aug 11 '22 16:08 abiosoft

We're not sure because it has occurred very sporadically. But it's been happening for at least a couple months.

chriscasola avatar Aug 12 '22 02:08 chriscasola

The only resolution is to tear down the colima VM and create a new one.

@chriscasola you mean that it works fine on a newly created VM but starts to malfunction after a while, until you teardown and recreate it again?

abiosoft avatar Aug 15 '22 05:08 abiosoft

you mean that it works fine on a newly created VM but starts to malfunction after a while, until you teardown and recreate it again?

@abiosoft correct.

chriscasola avatar Aug 15 '22 13:08 chriscasola

@abiosoft this happens daily for us. If there are any steps we can take to debug while the VM is in a bad state, let me know.

chriscasola avatar Aug 17 '22 14:08 chriscasola

@chriscasola did you install colima via brew?

abiosoft avatar Aug 17 '22 15:08 abiosoft

Yes, I did.

chriscasola avatar Aug 17 '22 15:08 chriscasola

@chriscasola This PR enables customising the network driver: https://github.com/abiosoft/colima/pull/399. Can you try starting afresh with the slirp network?

brew install --HEAD colima # install development version
colima delete # delete existing instance for clean behaviour
colima start --network-driver slirp

abiosoft avatar Aug 17 '22 16:08 abiosoft

@abiosoft I tried the slirp driver today but I ran into the same issue after about 6 hours of working.

chriscasola avatar Aug 18 '22 20:08 chriscasola

@abiosoft I tried the slirp driver today but I ran into the same issue after about 6 hours of working.

@chriscasola any idea if this happens after resuming your Mac from sleep? And a simple restart does not fix it?

abiosoft avatar Aug 18 '22 20:08 abiosoft

Restarting the VM usually works but sometimes it does not and I have to delete the VM.

No, my Mac did not sleep at all today between creating the new VM and when it hit the issue.

chriscasola avatar Aug 18 '22 20:08 chriscasola

@chriscasola can you kindly share the output of the following in both scenarios, i.e. when it is working fine and when it stops working?

colima ssh -- ip route

abiosoft avatar Aug 19 '22 17:08 abiosoft

This may be a separate but related issue, but I seem to be hitting connection errors when my Mac switches from ethernet to WiFi. The other day when I initially tried the slirp driver I still eventually had connection issues even though I was on ethernet only the entire time. Here is the output from the command you requested today:

Initial Start (on Ethernet)

$ colima -p slirp ssh -- ip route
default via 192.168.5.2 dev eth0  metric 202 
172.17.0.0/16 dev docker0 scope link  src 172.17.0.1 
172.18.0.0/16 dev br-1331da1855ba scope link  src 172.18.0.1 
192.168.5.0/24 dev eth0 scope link  src 192.168.5.15

After getting connection errors (on WiFi)

$ colima -p slirp ssh -- ip route
default via 192.168.5.2 dev eth0  metric 202 
172.17.0.0/16 dev docker0 scope link  src 172.17.0.1 
172.18.0.0/16 dev br-1331da1855ba scope link  src 172.18.0.1 
172.19.0.0/16 dev br-77ac2751083e scope link  src 172.19.0.1 
192.168.5.0/24 dev eth0 scope link  src 192.168.5.15

After restarting the colima VM (still on WiFi, still getting connection errors)

$ colima -p slirp ssh -- ip route
default via 192.168.5.2 dev eth0  metric 202 
172.17.0.0/16 dev docker0 scope link  src 172.17.0.1 
172.18.0.0/16 dev br-1331da1855ba scope link  src 172.18.0.1 
172.19.0.0/16 dev br-77ac2751083e scope link  src 172.19.0.1 
192.168.5.0/24 dev eth0 scope link  src 192.168.5.15 
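
Comparing the listings by eye is error-prone, so here is a minimal sketch that diffs two saved `ip route` outputs and prints only the routes that are new in the later one. The sample data is copied from the outputs above; the function name `new_routes` and the temporary file names are illustrative assumptions, not part of colima. On a live system the two files could be captured by redirecting `colima -p slirp ssh -- ip route` into a file at each point in time.

```shell
#!/bin/sh
# new_routes BEFORE AFTER: print routes present in AFTER but not in BEFORE.
# Both files hold saved `ip route` output; whitespace is normalized first.
new_routes() {
    awk '{ $1 = $1 }                      # collapse runs of spaces
         NR == FNR { seen[$0] = 1; next } # first file: remember each route
         !($0 in seen)                    # second file: print unseen routes
        ' "$1" "$2"
}

# Sample data taken from the thread (initial start vs. after the errors began).
workdir=$(mktemp -d)
cat > "$workdir/before.txt" <<'EOF'
default via 192.168.5.2 dev eth0  metric 202
172.17.0.0/16 dev docker0 scope link  src 172.17.0.1
172.18.0.0/16 dev br-1331da1855ba scope link  src 172.18.0.1
192.168.5.0/24 dev eth0 scope link  src 192.168.5.15
EOF
cat > "$workdir/after.txt" <<'EOF'
default via 192.168.5.2 dev eth0  metric 202
172.17.0.0/16 dev docker0 scope link  src 172.17.0.1
172.18.0.0/16 dev br-1331da1855ba scope link  src 172.18.0.1
172.19.0.0/16 dev br-77ac2751083e scope link  src 172.19.0.1
192.168.5.0/24 dev eth0 scope link  src 192.168.5.15
EOF

new_routes "$workdir/before.txt" "$workdir/after.txt"
# prints: 172.19.0.0/16 dev br-77ac2751083e scope link src 172.19.0.1
rm -r "$workdir"
```

With the data from this thread, the only difference is a new Docker bridge route for 172.19.0.0/16. The VPN interface address in the original report, 172.19.21.19, falls inside that range, which may be worth checking.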

chriscasola avatar Aug 22 '22 19:08 chriscasola

Still digging, but it seems like deleting and recreating the docker network I'm using resolves the issue.
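
One possible explanation, offered as a hypothesis rather than a confirmed diagnosis: a Docker bridge network can claim a subnet that also contains the corporate address, so the VM sends that traffic to the bridge instead of out through the host. In the original report the hostname resolved to 172.21.132.52 while the VM had a bridge (br-aa39713c361f) at 172.21.0.1 with a 255.255.0.0 mask. The sketch below checks an address against a list of subnets in plain POSIX shell; the helper names `ip_to_int` and `in_cidr` are illustrative and not part of any tool mentioned here.

```shell
#!/bin/sh
# ip_to_int ADDR: print a dotted-quad IPv4 address as a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# in_cidr ADDR CIDR: succeed (exit 0) if ADDR lies inside the CIDR block.
in_cidr() {
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")   # part before the slash
    bits=${2#*/}                 # prefix length after the slash
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    [ $(( addr & mask )) -eq $(( net & mask )) ]
}

# Values from the thread: the resolved address and the VM's Docker bridge subnets.
target=172.21.132.52
for subnet in 172.17.0.0/16 172.18.0.0/16 172.20.0.0/16 172.21.0.0/16; do
    if in_cidr "$target" "$subnet"; then
        echo "$target overlaps Docker subnet $subnet"
    fi
done
# prints: 172.21.132.52 overlaps Docker subnet 172.21.0.0/16
```

If an overlap like this is found, recreating the Docker network on a range that does not collide with the corporate network (for example with `docker network create --subnet`) is one option to try.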

I also ran into DNS resolution issues, where the DNS server I specified with the start command was being replaced in /etc/resolv.conf with some other IP. That seems to be fixed by doing the following:

colima start -d <my-dns-ip>
colima ssh
sudo -i
mkdir /etc/udhcpc/
echo "RESOLV_CONF=\"no\"" > /etc/udhcpc/udhcpc.conf
echo "nameserver <my-dns-ip>" > /etc/resolv.conf
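
To check later whether the entry is still in place, the resolver file can be tested for the expected line. This is a minimal sketch, assuming a placeholder DNS address 10.0.0.53 in place of <my-dns-ip> and a temporary file standing in for /etc/resolv.conf; the helper name `has_nameserver` is illustrative.

```shell
#!/bin/sh
# has_nameserver FILE IP: succeed if FILE contains the line "nameserver IP".
has_nameserver() {
    awk -v ip="$2" '$1 == "nameserver" && $2 == ip { found = 1 }
                    END { exit !found }' "$1"
}

# Demo against a temporary file; inside the VM, pass /etc/resolv.conf instead.
sample=$(mktemp)
echo "nameserver 10.0.0.53" > "$sample"   # 10.0.0.53 is a placeholder address

if has_nameserver "$sample" 10.0.0.53; then
    echo "DNS override still in place"
else
    echo "resolv.conf was rewritten"
fi
rm -f "$sample"
```

The comparison is an exact string match on the second field, so a partial address such as 10.0.0.5 does not count as a match.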

chriscasola avatar Aug 23 '22 19:08 chriscasola

@chriscasola thanks for the information. I will dig a bit more on my end.

abiosoft avatar Aug 23 '22 20:08 abiosoft

Just confirming that the two workarounds in my previous comment (recreating the Docker network and pinning the DNS server) do resolve this issue for me.

chriscasola avatar Aug 30 '22 15:08 chriscasola

I have the same issue on Apple M1, and the problem is not DNS. Even trying to reach the IP directly works from my host and not from the VM.

lracicot avatar Dec 31 '22 03:12 lracicot