Weird hanging bug with --kill-others
Hi,
I've found a super weird bug that I'd like some help figuring out.
When running concurrently in my node-alpine docker container, PIDs are (consistently) one off.
To test this, I modified concurrently to print the PID when spawning a child process, and again when trying to kill it in killOtherProcesses. http-server's actual PID is consistently one higher than the saved PID.
Example output:
bash-4.3# ps aux
PID USER TIME COMMAND
1 root 0:00 bash
1293 root 0:00 ps aux
bash-4.3# concurrently --kill-others "http-server ." "npm --version" &
[1] 1294
1300
1307
[0] Starting up http-server, serving .
[0] Available on:
[0] http://127.0.0.1:8080
[0] http://172.17.0.3:8080
[0] Hit CTRL-C to stop the server
[1] 4.4.4
[1] npm --version exited with code 0
--> Sending SIGTERM to other processes..
1300
bash-4.3# ps
PID USER TIME COMMAND
1 root 0:00 bash
1294 root 0:00 node /usr/bin/concurrently --kill-others http-server . npm --version
1301 root 0:00 node /usr/bin/http-server .
1315 root 0:00 ps
So it's trying to kill http-server at PID 1300, but http-server is actually at 1301. If I change the code to kill child.pid+1 instead, everything works fine...
Any ideas? I'm stumped as to where to look next.
Oh, just when I've filed an issue I find out what the problem is, as always.
Apparently Alpine uses busybox as /bin/sh, and spawn-command always launches commands through /bin/sh, and the busybox shell forks all commands.
Solution? I'm not sure. Perhaps spawn-command shouldn't wrap commands in /bin/sh? Or perhaps I should just replace the busybox shell?
So just when I've made an issue I find out what the problem is, as always.
Alpine uses busybox ash as /bin/sh, which is the shell that spawn-command will always use. Busybox ash forks any command given via -c, so the PID changes.
Solution? Either:
Replace /bin/sh with something else, e.g. bash
Or change spawn-command to not spawn inside a shell
Not sure which would be the best idea.
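For anyone who wants to see this failure mode without an Alpine box: any POSIX shell can be forced into the same fork by giving -c a compound command (busybox ash just does it even for a single command). A rough sketch, with sleep standing in for http-server:

```shell
# The trailing ':' stops the shell from exec()ing sleep in place,
# so it forks a child for sleep -- just like busybox ash does for
# every command it is given via -c.
sh -c 'sleep 2; :' &
saved=$!                 # what concurrently stores as child.pid
sleep 0.2                # give the shell a moment to fork
ps -o comm= -p "$saved"  # shows "sh", not "sleep"
```

Killing $saved here only terminates the wrapping shell; the forked sleep keeps running, which is the orphaning that --kill-others trips over.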
Not a concurrently issue, but it may be related to other issues here.
Great that you found a solution, @Maistho. But if there's something feasible we could do on our side, please let us know.
By the way, you could try concurrently v3.2 or older; they used another dependency to spawn commands: spawn-default-shell
Hi,
I am seeing the same behavior when running in an Alpine docker container.
The same npm script with the same node_modules directory, just mapped into an Ubuntu docker container instead, will kill the other process.
@cuipengfei yup, you'll need to use bash instead of the builtin shell in alpine.
I have a docker image that uses bash instead, kadadev/node-small, that you might use. If you want to build your own image, you can take a look at my Dockerfile to see how to switch to bash: https://github.com/KadaDev/node-small/blob/master/Dockerfile
@Maistho linking /bin/sh to /bin/bash doesn't appear to work. It still hangs at "Sending SIGTERM to other processes". As far as I can tell, /bin/ash doesn't fork anymore as you describe.
@gustavohenke should this be reopened?
Ideas on how to fix this in concurrently?
Untested workaround: concurrently 'exec cmd1' 'exec cmd2'. IIUC, this results in /bin/sh -c 'exec cmd1', which should make any POSIX-ish shell replace itself rather than fork?
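That workaround is easy to sanity-check outside concurrently. A minimal sketch, with sleep standing in for a real command:

```shell
# With exec, the shell replaces itself with the command instead of
# forking, so the PID the parent saved keeps pointing at the real
# process, and a plain kill(pid) reaches the right target.
sh -c 'exec sleep 2' &
saved=$!
sleep 0.2
ps -o comm= -p "$saved"  # shows "sleep": same PID, shell is gone
kill "$saved"            # terminates the actual command
```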
BTW I'm confused why OP's shell exits after forking. If the shell stuck around, concurrently should be able to kill both it and its children (it uses https://github.com/pkrumins/node-tree-kill).
But the ps output proves it forked, then exited!
I can reproduce such behavior by backgrounding: concurrently 'sleep 100 &' results in sleep being orphaned and reparented to PID 1, but I don't think people use & under concurrently :smile:
Untested idea: what if concurrently put each command in its own process group, then sent signals to whole groups rather than to the specific PIDs it saved? This is roughly how job control works in shells, and how Ctrl+C is delivered reliably. (Of course this won't catch a process deliberately "daemonizing" itself, but that takes more than a simple fork by a shell.)
FWIW, concurrently --kill-others 'sleep 100 &' 'sleep 2' reproduces the hanging (normal Linux with bash, Fedora 30).
I was bitten by this, and it took a while to narrow it down to concurrently, and then to notice that the behaviour was different in Docker than on the desktop.
In my case (details below), it was a piping command making concurrently -k not kill the process, but only on Alpine.
"start": "firebase emulators:start --project=demo-1 | $(filter-emulators)",
concurrently is such a nice and reliable workhorse; it would be nice if this could somehow be addressed.
Steps:
You need: Node 14 or 16; npm 7.7+, Docker
$ git clone https://github.com/akauppi/firebase-jest-testing.git
$ cd firebase-jest-testing
$ git checkout next
$ npm install
Works in desktop (anything with bash and grep):
$ npm test # should eventually exit to command prompt
To try with Alpine, let's build a Docker image.
$ pushd ci/firebase-ci-builder.sub
$ ./build
$ popd
...and run it:
$ docker run -it --rm -v $(pwd):/workspace -w /workspace firebase-ci-builder:9.12.1-node16-npm7 /bin/bash
# npm test
This gets stuck after the tests.
Work-around:
Have a separate npm test (which does the filtering) and npm run ci (which doesn't), so that I can run CI within Alpine.
If you get timeouts, comment out the testTimeout line in sample/jest.config.default.js. Docker seems to need a bit more time, at least on my machine.
Still reproducible in 7.6.0. I noticed that referencing another npm script (e.g. "npm:dev") causes the hanging, but writing the command directly seems to work fine.