
Intermittent failure to release port bindings

Open rfay opened this issue 3 years ago • 15 comments

Hi, and thanks for this project!

In ddev's full test run for macOS/Colima I see intermittent failures caused apparently by a failure to release port bindings. I haven't experienced this in a local non-github-actions environment, but since most runs are successful, I think it's most likely an issue with Colima/Lima, but don't know how to chase it.

See https://github.com/drud/ddev/runs/5234475567?check_suite_focus=true and search the test run for FAIL:. The first one happens in TestRouterConfigOverride, and you get "Unable to listen on required ports, port 443 is already in use". It seems that port 443 has not been released between tests (and this happens with other ports on other runs).

I don't expect that this has anything to do with the tests, as nothing like this happens in the many other test environments, including docker desktop on macOS, linux, windows, and WSL2.

Any thoughts on how to study this problem and narrow down the scope would be much appreciated. I can probably use tmate and have it stop on failure, but with the current test structure my bet is that the issue might be resolved before I get there. Maybe not. But it's awkward and hard because it's probably one of every 3 test runs, and the test runs take almost 2 hours.

rfay avatar Feb 18 '22 15:02 rfay

This is most likely due to the possible 5 second wait before port forwarding is established (or terminated), a limitation in Lima at the moment.

See https://github.com/abiosoft/colima/issues/71#issuecomment-979516106.

abiosoft avatar Feb 18 '22 15:02 abiosoft

Hi! Not sure if helpful, but sharing in case it is.

I've just experienced a connection refused in my local colima, running on macOS 11.6.2 with colima 0.3.2. My containers were up and running. A curl from within the container succeeded, but a curl from the host resulted in a connection refused. colima status showed colima was running. Restarting the colima vm with colima stop && colima start resolved the issue. Let me know if I can provide any logs (in that case please let me know where to find them) or any more info that may be of interest.

colima version 0.3.2
git commit: 272db4732b90390232ed9bdba955877f46a50552

runtime: docker
arch: x86_64
client: v20.10.12
server: v20.10.11

flavianmissi avatar Feb 21 '22 12:02 flavianmissi

I see that regularly in tests also, didn't mention it above. When the tests don't fail due to ports always being bound, they sometimes fail to connect, and I guess it's the same delay @abiosoft mentioned. Ugly though. Not sure of a good workaround. ddev projects go up and down perhaps hundreds of times in a test run, so I could add a sleep 5 after every single app.Start(), but that's sure not pretty.
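
One alternative to a fixed sleep 5 would be to poll the host port until it reaches the expected state. What follows is only a sketch in Go, not code from ddev or Colima: portOpen and waitForPort are hypothetical helper names, and the address, timeout, and poll interval are arbitrary choices. It also assumes that a refused TCP dial on the host means the forwarded port has been released.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// portOpen reports whether something accepts TCP connections on addr.
// ASSUMPTION: a failed dial is treated as "port released".
func portOpen(addr string) bool {
	conn, err := net.DialTimeout("tcp", addr, 500*time.Millisecond)
	if err != nil {
		return false
	}
	conn.Close()
	return true
}

// waitForPort polls addr until its state matches wantOpen,
// or returns an error once timeout has elapsed.
func waitForPort(addr string, wantOpen bool, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		if portOpen(addr) == wantOpen {
			return nil
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("port %s: wanted open=%v, not reached within %s", addr, wantOpen, timeout)
		}
		time.Sleep(200 * time.Millisecond)
	}
}

func main() {
	// Hypothetical usage: wait for 443 to be released before the next start.
	if err := waitForPort("127.0.0.1:443", false, 10*time.Second); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("port 443 is free")
}
```

A test could call waitForPort with wantOpen set to true after app.Start() and with wantOpen set to false after stopping the project, so it waits only as long as the forwarder actually takes instead of a flat 5 seconds each time.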

rfay avatar Feb 21 '22 13:02 rfay

The plan is to have vmnet networking bundled with colima so each vm gets an accessible static IP address (regardless of port-forwarding to localhost). The main downside is that it will require root access for initial setup.

abiosoft avatar Feb 25 '22 06:02 abiosoft

It would be awesome to have a reliable port binding situation. For me, root for install is fine. I definitely know the drawbacks.

rfay avatar Feb 25 '22 22:02 rfay

Kindly install the current development version with brew install --HEAD colima and try again. You can retrieve the IP address via colima ls or colima ls --json if you need structured output for further processing.
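
For the further-processing case, a rough Go sketch of pulling the address out of the JSON output might look like this. Everything specific here is an assumption rather than something stated in this thread: that colima ls --json prints one JSON object per instance, that the fields are called name and address, and the sample data itself, which is made up. The helper name parseAddresses is hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// instance holds the two fields this sketch reads.
// ASSUMPTION: the JSON keys "name" and "address" are guesses at the
// output format; check them against real `colima ls --json` output.
type instance struct {
	Name    string `json:"name"`
	Address string `json:"address"`
}

// parseAddresses decodes a stream of JSON objects and returns a map
// from instance name to address, skipping instances with no address.
func parseAddresses(out string) (map[string]string, error) {
	dec := json.NewDecoder(strings.NewReader(out))
	addrs := make(map[string]string)
	for {
		var inst instance
		err := dec.Decode(&inst)
		if err == io.EOF {
			return addrs, nil
		}
		if err != nil {
			return nil, err
		}
		if inst.Address != "" {
			addrs[inst.Name] = inst.Address
		}
	}
}

func main() {
	// Made-up sample input, not captured from a real run.
	sample := `{"name":"default","status":"Running","address":"192.168.106.2"}
{"name":"other","status":"Stopped"}`
	addrs, err := parseAddresses(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(addrs["default"])
}
```

In real use the input string would come from running the command, for example through os/exec, and tests could then connect to the VM address directly rather than waiting on localhost port forwarding.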

abiosoft avatar Mar 19 '22 15:03 abiosoft

Should this be reopened, as it seems to remain an issue? Thanks for all the great work.

rfay avatar Mar 23 '22 13:03 rfay

Yeah, that's fine.

abiosoft avatar Mar 23 '22 13:03 abiosoft

I have to restart DDEV's Colima tests sometimes 3-4-5 times just because of this issue. I understand why it doesn't much affect an ordinary user, but wow, I wish there was a way around this. Even a workaround.

rfay avatar May 20 '22 21:05 rfay

@rfay considering that gvproxy is now being utilised in Colima, Docker/Containerd events can be monitored directly and forwarded via gvproxy instead.

It will be resolved in Lima eventually and the preference is to delegate to Lima as much as possible. But nonetheless, I will explore and see if it does not require too much effort.

abiosoft avatar May 21 '22 05:05 abiosoft

Much appreciated!

rfay avatar May 21 '22 11:05 rfay

Any thoughts about workarounds I could use to mitigate this problem?

rfay avatar Jun 23 '22 15:06 rfay

@rfay I explored this https://github.com/abiosoft/colima/issues/189#issuecomment-1133537104 and it is relatively straightforward to implement for Docker (and Containerd) but not for Kubernetes.

Does your workflow involve Kubernetes, or is it primarily Docker? Nonetheless, I will prioritise this. It can always be improved and will be an optional feature.

abiosoft avatar Jun 23 '22 17:06 abiosoft

I guess I didn't understand what you meant in https://github.com/abiosoft/colima/issues/189#issuecomment-1133537104 (and still don't). What action would I take to solve this problem of port bindings not being released? Run some kind of external process doing listening and taking some kind of action?

This is only for docker.

Thanks so much for looking at it.

rfay avatar Jun 23 '22 18:06 rfay

> I guess I didn't understand what you meant in https://github.com/abiosoft/colima/issues/189#issuecomment-1133537104 (and still don't). What action would I take to solve this problem of port bindings not being released? Run some kind of external process doing listening and taking some kind of action?

@rfay no action is required on your end. I will notify when it is ready for testing.

abiosoft avatar Jun 24 '22 19:06 abiosoft

I've run into this issue after updating my machine today to 13.2.1 (22D68). I'm looking for a workaround using colima 0.5.2.

kaihendry avatar Feb 15 '23 06:02 kaihendry

Pretty sure this is obsolete so closing.

rfay avatar Jan 04 '24 14:01 rfay