Lima holding port open after colima is stopped

Open rfay opened this issue 1 year ago • 7 comments

Description

After stopping colima to debug port failures, I still see the port being held open by Lima:

$ colima stop fresh
INFO[0000] stopping colima [profile=fresh]
INFO[0000] stopping ...                                  context=docker
INFO[0001] stopping ...                                  context=vm
INFO[0005] done

$ sudo lsof -i :8025 -sTCP:LISTEN
Password:
COMMAND   PID USER   FD   TYPE             DEVICE SIZE/OFF NODE NAME
ssh     53130 rfay   21u  IPv4 0x8d534c11986d2b73      0t0  TCP localhost:ca-audit-da (LISTEN)
rfay@rfay-tag1-m1:~/workspace/platform-wpbedrock/web$ ps -fp 53130
  UID   PID  PPID   C STIME   TTY           TIME CMD
  501 53130     1   0 Sun03PM ??         0:54.36 ssh: /Users/rfay/.lima/colima-fresh/ssh.sock [mux]
$ limactl list
NAME            STATUS     SSH            CPUS    MEMORY    DISK      DIR
colima          Stopped    127.0.0.1:0    4       6GiB      100GiB    ~/.lima/colima
colima-fresh    Stopped    127.0.0.1:0    4       6GiB      100GiB    ~/.lima/colima-fresh
$ limactl stop colima-fresh
INFO[0000] Sending SIGINT to hostagent process 8201
INFO[0000] Waiting for the host agent and the driver processes to shut down
INFO[0000] [hostagent] Received SIGINT, shutting down the host agent
INFO[0000] [hostagent] Shutting down the host agent
INFO[0000] [hostagent] Stopping forwarding "/var/run/docker.sock" (guest) to "/Users/rfay/.colima/fresh/docker.sock" (host)
INFO[0000] [hostagent] Stopping forwarding "/run/lima-guestagent.sock" (guest) to "/Users/rfay/.lima/colima-fresh/ga.sock" (host)
INFO[0000] [hostagent] Unmounting "/Users/rfay"
INFO[0000] [hostagent] Unmounting "/tmp/colima-fresh"
INFO[0000] [hostagent] Shutting down QEMU with ACPI
INFO[0000] [hostagent] Sending QMP system_powerdown command
INFO[0005] [hostagent] QEMU has exited

Even after that, the port is still held open:

$ sudo lsof -i :8025 -sTCP:LISTEN
Password:
COMMAND   PID USER   FD   TYPE             DEVICE SIZE/OFF NODE NAME
ssh     53130 rfay   21u  IPv4 0x8d534c11986d2b73      0t0  TCP localhost:ca-audit-da (LISTEN)

Version

Colima Version: 0.5.4
Lima Version: 0.15.0
Qemu Version: 7.2.0

Operating System

  • [ ] macOS Intel <= 12 (Monterey)
  • [ ] macOS Intel >= 13 (Ventura)
  • [ ] macOS M1 <= 12 (Monterey)
  • [X] macOS M1 >= 13 (Ventura)
  • [ ] Linux

Output of colima status

$ colima status fresh
FATA[0000] colima [profile=fresh] is not running

After restarting:

$ colima status fresh
INFO[0000] colima [profile=fresh] is running using QEMU
INFO[0000] arch: aarch64
INFO[0000] runtime: docker
INFO[0000] mountType: sshfs
INFO[0000] socket: unix:///Users/rfay/.colima/fresh/docker.sock

Reproduction Steps

I was running some DDEV tests and suddenly saw a port-held-open problem that I couldn't resolve by stopping all containers... then couldn't fix it by stopping Colima either.

Expected behaviour

When colima is stopped (or containers are stopped) no ports should be held open.

Additional context

Thanks as always!

I recognize that this may be a lima bug.

Killing the process did get rid of the port hold.

$ kill 53130
$ ps -fp 53130
  UID   PID  PPID   C STIME   TTY           TIME CMD
$ sudo lsof -i :8025 -sTCP:LISTEN
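
The manual steps above (find the listener, confirm it is the ssh mux process, kill it) could be wrapped in a small helper. This is only a sketch: the function names are hypothetical, and it assumes the leftover process shows up in ps exactly as in the transcript (`ssh: <socket path> [mux]`) and that `lsof` is installed.

```shell
#!/bin/sh
# Extract the control-socket path from a ps command line such as
#   ssh: /Users/rfay/.lima/colima-fresh/ssh.sock [mux]
# Prints nothing if the line does not look like an ssh mux master.
mux_socket_from_cmd() {
  printf '%s\n' "$1" | sed -n 's/^ssh: \(.*\) \[mux\]$/\1/p'
}

# Kill any ssh mux master listening on TCP port $1 (hypothetical helper).
# Anything else that holds the port is reported and left alone.
free_port_if_mux() {
  port=$1
  for pid in $(lsof -nP -t -iTCP:"$port" -sTCP:LISTEN 2>/dev/null); do
    cmd=$(ps -o command= -p "$pid")
    sock=$(mux_socket_from_cmd "$cmd")
    if [ -n "$sock" ]; then
      echo "killing ssh mux $pid ($sock) holding port $port"
      kill "$pid"
    else
      echo "port $port is held by pid $pid ($cmd); leaving it alone"
    fi
  done
}
```

Calling `free_port_if_mux 8025` would then do what the `kill 53130` above did by hand. It only touches processes whose command line matches the mux pattern.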

rfay avatar Mar 06 '23 23:03 rfay

I would expect the port to be held for a maximum of 5 seconds after shutdown, so something seems off.

Maybe the Lima host agent (which handles the port forwarding) is still running somehow. Are you doing some sort of stress test? Maybe there is a large number of connections and the ssh process is still taking time to get rid of them.
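
One way to check the "host agent still running" theory would be to look at what is left in the instance directory. The sketch below assumes that Lima keeps a `ha.pid` file next to the `ssh.sock` seen in the ps output; that file name is an assumption here, and the function names are hypothetical.

```shell
#!/bin/sh
# Return 0 if the pid file $1 exists and names a live process.
pidfile_alive() {
  [ -r "$1" ] || return 1
  pid=$(cat "$1")
  [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Report leftovers in a Lima instance directory, e.g. ~/.lima/colima-fresh.
# "ha.pid" is an assumed file name; "ssh.sock" is taken from the ps output.
report_leftovers() {
  dir=$1
  if pidfile_alive "$dir/ha.pid"; then
    echo "hostagent still running"
  else
    echo "hostagent not running"
  fi
  if [ -S "$dir/ssh.sock" ]; then
    echo "ssh control socket still present: $dir/ssh.sock"
  fi
}
```

If `report_leftovers ~/.lima/colima-fresh` says the host agent is gone but the control socket is still there, that would point at the ssh mux process rather than the host agent.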

Can you share the command (or the images you are running)?

abiosoft avatar Mar 09 '23 06:03 abiosoft

Thanks. Unfortunately I don't know how I got into this situation, or whether I'd done anything unusual. I was just doing totally normal things: I ran ddev poweroff, then ddev start kept failing due to port conflicts, so I started debugging and found that the port was held by ssh:

rfay@rfay-tag1-m1:~/workspace/platform-wpbedrock/web$ ps -fp 53130
  UID   PID  PPID   C STIME   TTY           TIME CMD
  501 53130     1   0 Sun03PM ??         0:54.36 ssh: /Users/rfay/.lima/colima-fresh/ssh.sock [mux]

This was even though Lima and Colima were both supposed to be stopped. I can't guess how this got left running, but since I'd never seen anything like this before I figured I'd better report the situation.

rfay avatar Mar 09 '23 13:03 rfay

I haven't seen this again and don't know what happened; I imagine it's a timing- or shutdown-sensitive bug in Lima. Closing for now.

rfay avatar Mar 15 '23 23:03 rfay

@Firesphere is reporting this currently.

rfay avatar Aug 15 '23 02:08 rfay

I am now experiencing this exact same issue, with the exact same details (except for the username).

prj, project etc. in the output are placeholders for my actual project names. My username is really me on my own laptop :)

Please note, the order of the output is not entirely chronological, as I'm scrolling up through my terminal, trying to grab everything even slightly relevant that I tried. These are, however, all from before I restarted Colima, which solves the problem.

Before trying to start the container again, I ran ddev poweroff.

me@Ichthyocentaurs ~ % limactl --version
limactl version 0.17.2
me@Ichthyocentaurs ~ % colima --version
colima version 0.5.5
me@Ichthyocentaurs ~ % uname -a
Darwin Ichthyocentaurs.hub 22.5.0 Darwin Kernel Version 22.5.0: Thu Jun  8 22:22:23 PDT 2023; root:xnu-8796.121.3~7/RELEASE_ARM64_T6020 arm64

Colima ssh config:

Host lima-colima
  StrictHostKeyChecking no
  UserKnownHostsFile /dev/null
  NoHostAuthenticationForLocalhost yes
  GSSAPIAuthentication no
  PreferredAuthentications publickey
  Compression no
  BatchMode yes
  IdentitiesOnly yes
  Ciphers "^aes128-gcm@openssh.com,aes256-gcm@openssh.com"
  User me
  ControlMaster auto
  ControlPersist 5m
  Hostname 127.0.0.1
  Port 51664
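
The `ControlMaster auto` and `ControlPersist 5m` lines may be relevant: with those options OpenSSH keeps a background master process alive on the control socket after the last client disconnects, and that is the kind of process ps shows as `[mux]`. Whether this explains the stuck listener is not established in this thread. A rough sketch for inspecting it, with hypothetical function names and the `lima-colima` host alias taken from the config above:

```shell
#!/bin/sh
# Print the ControlPersist value from an ssh config read on stdin.
control_persist() {
  awk 'tolower($1) == "controlpersist" { print $2; exit }'
}

# Ask the master behind control socket $1 whether it is still alive.
# ssh needs a destination argument even with -S; $2 defaults to the alias above.
mux_check() {
  ssh -S "$1" -O check "${2:-lima-colima}"
}

# Ask the master behind control socket $1 to exit cleanly.
mux_exit() {
  ssh -S "$1" -O exit "${2:-lima-colima}"
}
```

For example, `mux_check /Users/me/.lima/colima/ssh.sock` uses the socket path from the ps output, and `mux_exit` with the same path is a gentler alternative to killing the PID.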

Attempting to find out what is listening on port 8025:

me@Ichthyocentaurs ~ % lsof -n -i4TCP:8025 | grep LISTEN
ssh     32913 me   26u  IPv4 0xb896263074226ce1      0t0  TCP 127.0.0.1:ca-audit-da (LISTEN)
me@Ichthyocentaurs ~ % ps -p 32913                      
  PID TTY           TIME CMD
32913 ??         4:10.30 ssh: /Users/me/.lima/colima/ssh.sock [mux]

Error from DDEV:

me@Ichthyocentaurs prj % ddev start
Starting prj... 
 Network ddev-prj_default  Created 
 Container ddev-prj-db  Created 
 Container ddev-prj-web  Created 
 Container ddev-prj-web  Started 
 Container ddev-prj-db  Started 
You have Mutagen enabled and your 'silverstripe' project type doesn't have `upload_dirs` set. 
For faster startup and less disk usage, set upload_dirs to where your user-generated files are stored. 
If this is intended you can disable this warning with `ddev config --disable-upload-dirs-warning`. 
Starting mutagen sync process... This can take some time. 
..Mutagen sync flush completed in 5s.
For details on sync status 'ddev mutagen st prj -l' 
Failed to start prj: Unable to listen on required ports, port 8025 is already in use,
Troubleshooting suggestions at https://ddev.readthedocs.io/en/stable/users/basics/troubleshooting/#unable-listen 
simonerkelens@Ichthyocentaurs prj % docker ps
CONTAINER ID   IMAGE                                                                COMMAND                  CREATED          STATUS                    PORTS                                                         NAMES
6dea2fc54fe6   ddev/ddev-dbserver-mariadb-10.4:v1.22.0-prj-built                 "/docker-entrypoint.…"   18 seconds ago   Up 15 seconds (healthy)   127.0.0.1:49179->3306/tcp                                     ddev-prj-db
09cdf6acfcb6   ddev/ddev-webserver:20230803_php_serialize_precision-prj-built    "/pre-start.sh"          18 seconds ago   Up 16 seconds (healthy)   8025/tcp, 127.0.0.1:49181->80/tcp, 127.0.0.1:49180->443/tcp   ddev-prj-web
7795ba0d8bbf   ddev/ddev-ssh-agent:v1.22.0-built                                    "/entry.sh ssh-agent"    6 minutes ago    Up 6 minutes (healthy)                                                                  ddev-ssh-agent
me@Ichthyocentaurs ~ % netstat -anv | grep 8025
tcp4       0      0  127.0.0.1.8025         *.*                    LISTEN       131072  131072  32913      0 00100 00000006 0000000000fc531e 00000000 00000900      1      0 000001

Firesphere avatar Aug 15 '23 02:08 Firesphere

A short-term solution for this problem is to restart (Co)lima.

Firesphere avatar Aug 15 '23 23:08 Firesphere

I can confirm that this has happened to me occasionally and restarting colima frees the 8025 port.

pcambra avatar Oct 23 '23 10:10 pcambra

This definitely remains a problem; on Colima/Lima I've added a sleep after stopping a container to give the port time to be released. But this is a Lima problem, not a Colima one. I don't think it happens on Rancher Desktop though.
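
For anyone wanting something similar, a fixed sleep can also be written as a poll with a timeout. This is a minimal sketch and not DDEV's actual code: the function names are hypothetical and the default check assumes `lsof` is available.

```shell
#!/bin/sh
# Return 0 if something is listening on TCP port $1 (needs lsof).
port_in_use() {
  lsof -nP -iTCP:"$1" -sTCP:LISTEN >/dev/null 2>&1
}

# Poll once a second until port $1 is free or $2 seconds have passed.
# $3 names the check command and defaults to port_in_use.
# Returns 0 if the port became free, 1 on timeout.
wait_for_port_free() {
  port=$1
  timeout=${2:-10}
  check=${3:-port_in_use}
  waited=0
  while "$check" "$port"; do
    [ "$waited" -ge "$timeout" ] && return 1
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}
```

After stopping a container, `wait_for_port_free 8025 10 || echo "port 8025 still held"` returns as soon as the port is released instead of always waiting the full time.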

rfay avatar Aug 07 '24 14:08 rfay

Closing in favor of

  • https://github.com/lima-vm/lima/issues/2536

rfay avatar Aug 07 '24 14:08 rfay