VMs start and show running, but are unreachable and can't attach to them either

Open sblitzken opened this issue 5 years ago • 5 comments

Hi, I have firecracker v0.21.1 running on Ubuntu 20.04 LTS on a VMware Fusion machine (yes, I know this is kind of weird, but I don't have any extra hardware lying around). Ignite version info:

Ignite version: version.Info{Major:"0", Minor:"7", GitVersion:"v0.7.0", GitCommit:"0e3459476130fa360fcd058d4cf8a8ef7fdb68a0", GitTreeState:"clean", BuildDate:"2020-06-02T23:22:10Z", GoVersion:"go1.14.2", Compiler:"gc", Platform:"linux/amd64", SandboxImage:version.Image{Name:"weaveworks/ignite", Tag:"v0.7.0", Delimeter:":"}, KernelImage:version.Image{Name:"weaveworks/ignite-kernel", Tag:"4.19.125", Delimeter:":"}}
Firecracker version: v0.21.1
Runtime: containerd

CNI is v0.8.5.

When I run a VM:

-> sudo ignite run weaveworks/ignite-ubuntu --debug --cpus 2 --memory 1024 --ssh --name smoke

INFO[0001] Created VM with ID "8b5053e41d7a5cbf" and name "smoke"
INFO[0002] Networking is handled by "cni"
INFO[0002] Started Firecracker VM "8b5053e41d7a5cbf" in a container with ID "ignite-8b5053e41d7a5cbf"
INFO[0003] Waiting for the ssh daemon within the VM to start...
INFO[0023] Waiting for the ssh daemon within the VM to start...
INFO[0043] Waiting for the ssh daemon within the VM to start...
FATA[0062] Tried connecting to SSH but timed out dial tcp 10.61.0.4:22: i/o timeout

But then:

-> sudo ignite ps

VM ID             IMAGE                            KERNEL                             SIZE    CPUS  MEMORY  CREATED    STATUS    IPS        PORTS  NAME
8b5053e41d7a5cbf  weaveworks/ignite-ubuntu:latest  weaveworks/ignite-kernel:4.19.125  4.0 GB  2     1024 B  5m59s ago  Up 5m59s  10.61.0.4         smoke
9c08f6dcfd8dedce  weaveworks/ignite-ubuntu:latest  weaveworks/ignite-kernel:4.19.125  4.0 GB  2     1024 B  19m ago    Up 19m    10.61.0.3         onemogain

You can see when I tried to run earlier and had the same issue. I also can't stop them:

-> sudo ignite stop smoke
INFO[0000] Removing the container with ID "ignite-8b5053e41d7a5cbf" from the "cni" network
FATA[0000] failed to Statfs "/proc/9786/ns/net": no such file or directory

Nor can I force kill:

-> sudo ignite vm kill smoke
INFO[0000] Removing the container with ID "ignite-8b5053e41d7a5cbf" from the "cni" network
FATA[0000] failed to Statfs "/proc/9786/ns/net": no such file or directory

I can't attach to them either. I have no idea where to go from here to troubleshoot.

When I first installed firecracker, I could start a VM with the example hello-vmlinux.bin and hello-rootfs.ext4, but now I went back and checked and it's dumping core.

sblitzken avatar Jun 09 '20 21:06 sblitzken

We've now identified the issue: the ignite-spawn container has multiple interfaces used for bridging that share the same MAC address. The cause of this is still unknown. A quick fix is to ping (unsuccessfully) the previous IP address that CNI gave a VM, and after that ping the current IP address. This opens communication both ways (you can then ping the VM, and the VM can access the outside world).

cc @stealthybox, @chanwit

twelho avatar Jul 03 '20 15:07 twelho

See #634 for a possible (atm very hacky) fix

luxas avatar Jul 03 '20 16:07 luxas

I'm having this same problem on ignite 0.8.0 on Arch Linux. Is there any workaround that I could try? I'm not sure how to debug this. Is there a way to get logs from the container?

rdezavalia avatar Feb 11 '21 12:02 rdezavalia

@rdezavalia are you able to attach to your ignite VMs?

ignite attach --help

You can get a shell through that tty and debug inside out from there.

stealthybox avatar Mar 16 '21 20:03 stealthybox

If you can't attach, there's likely something wrong with the container runtime (containerd/docker) or the firecracker VM within the ignite sandbox.

stealthybox avatar Mar 16 '21 20:03 stealthybox