Container: Waiting for control socket to be removed

Open kkzetAM opened this issue 3 years ago • 10 comments

Hi, I still have the "Waiting for control socket to be removed" problem in 1.26, 1.26.1 and 1.27 with python3.9. It is the same as https://github.com/nginx/unit/issues/610

FROM nginx/unit:1.26.1-python3.9

This problem occurs randomly.

/usr/local/bin/docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, launching Unit daemon to perform initial configuration...
2022/06/23 11:13:41 [warn] 11#11 Unit is running unprivileged, then it cannot use arbitrary user and group.
2022/06/23 11:13:41 [info] 11#11 unit 1.26.1 started
2022/06/23 11:13:41 [info] 13#13 discovery started
2022/06/23 11:13:41 [notice] 13#13 module: python 3.9.9 "/usr/lib/unit/modules/python3.unit.so"
2022/06/23 11:13:41 [info] 12#12 controller started
2022/06/23 11:13:41 [notice] 12#12 process 13 exited with code 0
2022/06/23 11:13:41 [info] 16#16 router started
2022/06/23 11:13:41 [info] 16#16 OpenSSL 1.1.1k  25 Mar 2021, 101010bf
{
	"certificates": {},
	"config": {
		"listeners": {},
		"applications": {}
	}
}
/usr/local/bin/docker-entrypoint.sh: Looking for certificate bundles in /docker-entrypoint.d/...
/usr/local/bin/docker-entrypoint.sh: Looking for configuration snippets in /docker-entrypoint.d/...
/usr/local/bin/docker-entrypoint.sh: Applying configuration /docker-entrypoint.d/config.json
2022/06/23 11:13:41 [info] 20#20 "fastapi" prototype started
2022/06/23 11:13:41 [info] 21#21 "fastapi" application started
/usr/local/bin/docker-entrypoint.sh: OK: HTTP response status code is '200'
{
	"success": "Reconfiguration done."
}

/usr/local/bin/docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/...
/usr/local/bin/docker-entrypoint.sh: Stopping Unit daemon after initial configuration...
/usr/local/bin/docker-entrypoint.sh: Waiting for control socket to be removed...
2022/06/23 11:13:43 [notice] 12#12 process 15 exited with code 0
2022/06/23 11:13:43 [notice] 12#12 process 16 exited with code 0
2022/06/23 11:13:43 [warn] 21#21 [unit] #8: active request on ctx quit
2022/06/23 11:13:43 [warn] 21#21 [unit] #8: active request on ctx free
2022/06/23 11:13:43 [warn] 21#21 [unit] sendmsg(19, 16) failed: Broken pipe (32)
/usr/local/bin/docker-entrypoint.sh: Waiting for control socket to be removed...
/usr/local/bin/docker-entrypoint.sh: Waiting for control socket to be removed...
/usr/local/bin/docker-entrypoint.sh: Waiting for control socket to be removed...
/usr/local/bin/docker-entrypoint.sh: Waiting for control socket to be removed...
/usr/local/bin/docker-entrypoint.sh: Waiting for control socket to be removed...
/usr/local/bin/docker-entrypoint.sh: Waiting for control socket to be removed...
/usr/local/bin/docker-entrypoint.sh: Waiting for control socket to be removed...
/usr/local/bin/docker-entrypoint.sh: Waiting for control socket to be removed...
/usr/local/bin/docker-entrypoint.sh: Waiting for control socket to be removed...

kkzetAM avatar Jun 23 '22 11:06 kkzetAM

Fix is already developed and will be part of the next release. Thanks for reporting it to us.

tippexs avatar Jun 23 '22 15:06 tippexs

The same issue in 1.27.0-php8.1

dentelis avatar Aug 09 '22 11:08 dentelis

@tippexs I'm going to build the container with the latest unit version, but I could not find the commit that solved this issue in the commit history. Would you please mention the commit that solved this issue?

alishir avatar Aug 15 '22 07:08 alishir

Same issue in 1.28 with PHP 7.4 (Docker, built from https://github.com/nginx/unit/tree/master/pkg/docker/Dockerfile.php8.1, but with the PHP version of the source image changed to 7.4) and in the default Docker images (all from nginx/unit:1.26.1-php8.1 to nginx/unit:1.28.0-php8.1) when the "processes" count in the Unit configuration for the PHP application is greater than 1 (for example 8). I have to mention that this problem makes Unit almost useless for me, because the first reconfiguration is made on start from the JSON file in /docker-entrypoint.d/..., so the container starts in an invalid, hung state. Does anyone have ideas for a workaround until a fix for this problem is released?

lone-cat avatar Sep 14 '22 13:09 lone-cat

@lone-cat let me share the updated docker-entrypoint.sh with you in a minute. Happy for you to test it. Sorry the script changes did not get merged, and sorry that you are still having trouble with it.

Will be sharing the Script update for testing in a couple of hours.

tippexs avatar Sep 15 '22 07:09 tippexs

@lone-cat the fix is ready for you to test

Can you test this fix in the docker-entrypoint.sh script?

59,70c59,61
<             for i in {1..5}; do
<               if [[ -S /var/run/control.unit.sock ]]
<               then
<                 echo "$0 Waiting for control socket to be removed..."
<                 /bin/sleep 1.0
<               else
<                 break
<               fi
<             done
<             if [ -S /var/run/control.unit.sock ]; then
<              kill -SIGTERM `/bin/cat /var/run/unit.pid` && rm -f /var/run/control.unit.sock
<             fi
---
>
>             while [ -S /var/run/control.unit.sock ]; do echo "$0: Waiting for control socket to be removed..."; /bin/sleep 0.1; done
>

The full script can be found here: https://gist.github.com/tippexs/125658d0c90b03f8fc5deac19a6322eb

Please clone the script from our GitHub repo, apply the fix, and build a new Docker image that copies the new file back into your image.

Make sure you make the script executable before copying it in the container.

COPY docker-entrypoint.sh /usr/local/bin/

We are about to update the official Docker images if this fix works for you as well. Please let me know, and sorry again for the late response on this!

tippexs avatar Sep 15 '22 16:09 tippexs
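
For anyone adapting the workaround, the bounded wait from the diff above (the lines marked `<`) can be sketched as a standalone POSIX shell function. This is only an illustration, not the official entrypoint code: the function name `wait_for_socket_removal`, its parameters, and the non-zero return value on forced cleanup are assumptions made for this sketch.

```shell
#!/bin/sh
# Hypothetical helper (not part of the official docker-entrypoint.sh):
# wait for a Unix socket to disappear, giving up after a bounded number
# of attempts instead of looping forever.
#   $1 socket path, $2 pid file, $3 max attempts (default 5),
#   $4 delay between attempts in seconds (default 1)
wait_for_socket_removal() {
    sock=$1
    pidfile=$2
    attempts=${3:-5}
    delay=${4:-1}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -S is true only while the path exists and is a socket.
        [ -S "$sock" ] || return 0
        echo "Waiting for control socket to be removed..."
        sleep "$delay"
        i=$((i + 1))
    done
    # The daemon may have exited during the last sleep.
    [ -S "$sock" ] || return 0
    # Still present: ask the daemon to terminate and remove the stale socket.
    if [ -r "$pidfile" ]; then
        kill -TERM "$(cat "$pidfile")" 2>/dev/null
    fi
    rm -f "$sock"
    return 1
}
```

With the paths used in the diff, a call could look like `wait_for_socket_removal /var/run/control.unit.sock /var/run/unit.pid 5 1 || echo "socket removed forcibly"`. If the calling script runs under `set -e`, the call needs a guard like this one, because the sketch returns 1 when it had to clean up by force.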

@tippexs Thank you for this workaround, it works just as you would expect from a shell script: if reconfiguration "hangs", it prints "Waiting for control socket to be removed..." 4 times, and then Unit gets killed and starts again normally. Tested on my build of v1.28 with PHP 7.4, but I bet it will work in all other versions. This solution covers all my needs for now: I use only static preconfiguration from a JSON file and no changes are made "on the fly". But while I was searching for solutions in other issues I found your message https://github.com/nginx/unit/issues/570#issuecomment-1088576450 and I'm curious: will this problem appear for someone who reconfigures Unit "on the fly"? As you mentioned, the real problem is in the processing of the TERM signal in Unit. Will this someday be fixed in the Unit code itself, rather than by such a workaround? Just wondering... And really, thank you for this solution, I didn't even expect to get such a fast and working answer =) THANKS!!!!

lone-cat avatar Sep 15 '22 22:09 lone-cat

Hi @lone-cat - Thanks for testing the fix! Appreciate it!

No - This is a "docker initial start issue" only and will not apply if you reconfigure Unit using the API.

The main issue is not due to the SIGTERM. The handling of such signals works fine on a VM. Even in Docker this issue happens only under some special circumstances. I will move the fix into mainline and will let you all know once the images are built.

tippexs avatar Sep 16 '22 10:09 tippexs

@tippexs ok, got it =) thanks for the explanation.

lone-cat avatar Sep 16 '22 11:09 lone-cat

Yep, I still have this issue with NGINX Unit 1.28 and Python 3.10. Should I change the docker-entrypoint.sh script?

kkzetAM avatar Sep 23 '22 14:09 kkzetAM

@tippexs Any update on this issue?

Dawoodkhorsandi avatar Nov 30 '22 06:11 Dawoodkhorsandi