
amd64/docker: start fails when vm is restarted after docker installation

Open dimaqq opened this issue 1 year ago • 2 comments

Description

Fails when using docker/moby runtime:

🦐/c/hexanator (main)> colima start --profile amd64 --vm-type=vz --vz-rosetta --arch amd64 --cpu 8 --memory 8 --disk 50 --network-address --mount /code:w -e
INFO[0000] editing in vim from $EDITOR environment variable
INFO[0025] starting colima [profile=amd64]
INFO[0025] runtime: docker
INFO[0026] creating and starting ...                     context=vm
INFO[0081] provisioning ...                              context=docker
INFO[0082] starting ...                                  context=docker
> [hostagent] Shutting down the host agent
> "[hostagent] failed to exit SSH master" error="failed to execute `ssh -O exit -p 49514 127.0.0.1`, out=\"Control socket connect(/Users/dima/.colima/_lima/colima-amd64/ssh.sock): No such file or directory\\r\\n\": exit status 255"
> [hostagent] Shutting down QEMU with ACPI
> "[hostagent] failed to open the QMP socket \"/Users/dima/.colima/_lima/colima-amd64/qmp.sock\", forcibly killing QEMU" error="dial unix /Users/dima/.colima/_lima/colima-amd64/qmp.sock: connect: connection refused"
> [hostagent] QEMU has already exited
> exiting, status={Running:false Degraded:false Exiting:true Errors:[] SSHLocalPort:0} (hint: see "/Users/dima/.colima/_lima/colima-amd64/ha.stderr.log")
FATA[0097] error starting docker: error at 'starting': exit status 1
⏎

The last line in the serial console log is "GRUB_FORCE_PARTUUID set, attempting initrdless boot."
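For anyone trying to reproduce this, the host agent output points at ha.stderr.log inside the Lima instance directory. A minimal sketch for locating that directory from a profile name follows. The helper name colima_instance_dir is hypothetical (not part of colima), and the mapping is inferred only from the paths seen in this thread: colima-amd64 for the amd64 profile, and plain colima for what is assumed to be the default profile.

```shell
# Hypothetical helper, not part of colima or Lima.
# Maps a colima profile name to its Lima instance directory, following
# the paths that appear in this thread's logs.
# Assumption: the unnamed profile is called "default" and maps to "colima".
colima_instance_dir() {
    profile=${1:-default}
    if [ "$profile" = "default" ]; then
        name="colima"
    else
        name="colima-$profile"
    fi
    printf '%s\n' "$HOME/.colima/_lima/$name"
}

# Example (commented out so the snippet has no side effects):
# tail -n 50 "$(colima_instance_dir amd64)/ha.stderr.log"
```

The tail line is left commented out because the log only exists after a failed start on the affected machine.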

(/code is my case-sensitive volume for source code that I intend to use)

At the same time:

  • works for amd64/containerd
  • works for arm64/docker

Version

🦐/c/hexanator (main)> colima version && limactl --version && qemu-img --version
colima version 0.6.9
git commit: c3a31ed05f5fab8b2cdbae835198e8fb1717fd0f
limactl version 0.22.0
qemu-img version 9.0.1
Copyright (c) 2003-2024 Fabrice Bellard and the QEMU Project developers

Operating System

  • [ ] macOS Intel <= 13 (Ventura)
  • [ ] macOS Intel >= 14 (Sonoma)
  • [ ] Apple Silicon <= 13 (Ventura)
  • [X] Apple Silicon >= 14 (Sonoma)
  • [ ] Linux

Output of colima status

No response

Reproduction Steps

Expected behaviour

No response

Additional context

No response

dimaqq, Jul 01 '24 04:07

Let me know how I can help with this; the issue is trivially reproducible. I think I saw that the SSH port is 0 in the logs (SSHLocalPort:0 in the exit status line)?

dimaqq, Jul 01 '24 05:07

@dimaqq I don't think this is your issue, but mine was that I was requesting too much memory.

Specifically, I was passing --memory 8096 (copied from a previous minikube command) to colima. I think colima treats the value as gigabytes, so this asked for roughly 8,000 GB of memory, which very consistently caused what looks like the same error. Changing to --memory 8 fixed the problem.

> [hostagent] Shutting down the host agent
> "[hostagent] failed to exit SSH master" error="failed to execute `ssh -O exit -p 59831 127.0.0.1`, out=\"Control socket connect(/Users/user/.colima/_lima/colima/ssh.sock): No such file or directory\\r\\n\": exit status 255"
> [hostagent] Shutting down QEMU with ACPI
> "[hostagent] failed to open the QMP socket \"/Users/user/.colima/_lima/colima/qmp.sock\", forcibly killing QEMU" error="dial unix /Users/user/.colima/_lima/colima/qmp.sock: connect: connection refused"
> [hostagent] QEMU has already exited
> exiting, status={Running:false Degraded:false Exiting:true Errors:[] SSHLocalPort:0} (hint: see "/Users/user/.colima/_lima/colima/ha.stderr.log")

TL;DR: Not a great error message; check whether any of your parameters are wrong.
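The unit mix-up described above can be caught before the VM is started. Below is a minimal sketch, assuming --memory is given as a whole number of GiB. The function check_memory_gib is hypothetical (it is not a colima feature). It takes the host memory in bytes as a second argument, which on macOS can be read with sysctl -n hw.memsize.

```shell
# Hypothetical pre-flight check, not part of colima.
# $1 = requested memory in GiB (the value intended for --memory)
# $2 = host physical memory in bytes (on macOS: sysctl -n hw.memsize)
# Assumption: the requested value is a whole number.
check_memory_gib() {
    requested_gib=$1
    host_bytes=$2
    host_gib=$((host_bytes / 1024 / 1024 / 1024))
    if [ "$requested_gib" -gt "$host_gib" ]; then
        echo "requested ${requested_gib} GiB but host has only ${host_gib} GiB; was the value meant to be in MiB?" >&2
        return 1
    fi
    return 0
}

# Example on macOS (commented out so the snippet has no side effects):
# check_memory_gib 8 "$(sysctl -n hw.memsize)" && colima start --memory 8
```

With a 16 GiB host, a request of 8 passes and a request of 8096 is rejected with a hint about units, which is more helpful than the host agent shutdown messages quoted above.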

CarterFendley, Sep 11 '24 15:09