Wrong output order when specifying the `all` option
Reproduction steps
$ cat noop-132.mjs
#!/usr/bin/env node
import process from 'node:process';
process.stdout.write('1');
process.stderr.write('3');
process.stdout.write('2');
$ cat test.mjs
import {execa} from 'execa';
const { all } = await execa('./noop-132.mjs', {all: true });
console.log(all);
$ node test.mjs
123
Expected behavior
`node test.mjs` should output `132`, as the README says:
> Get interleaved output from stdout and stderr similar to what is printed on the terminal. (Async only)
$ ./noop-132.mjs
132
Actual behavior
`node test.mjs` outputs `123`.
Environment Info
$ node -v
v18.2.0
$ npm ls execa
[email protected] /Users/feng/test
└── [email protected]
Note that `process.stdout.write` is not guaranteed to be synchronous.
See Node.js docs: https://nodejs.org/api/process.html#a-note-on-process-io
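As a hedged illustration of that point (using a generic `PassThrough` with a tiny buffer as a stand-in, since `process.stdout`'s behavior depends on what it is attached to), a stream `write()` can return `false` to signal that the data was buffered rather than consumed synchronously:

```javascript
// Illustration of buffered (non-synchronous) stream writes, using a generic
// PassThrough with a small highWaterMark as a stand-in for a pipe-backed stdout.
import {PassThrough} from 'node:stream';

const stream = new PassThrough({highWaterMark: 4});

// The chunk is larger than the internal buffer, so write() returns false:
// the data is queued in the stream's internal buffers rather than delivered
// synchronously, and a well-behaved writer would wait for 'drain' before
// writing more. Fire-and-forget process.stdout.write() calls ignore this.
const accepted = stream.write('hello world');
console.log(accepted);
```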
@sindresorhus Thanks for the reply. The original problem is that when running a shell script, the output order is incorrect:
$ npm ls execa
[email protected] /Users/feng/test
└── [email protected]
$ cat script.sh
echo test1;
echo test2;
echo test3;
echo test4;
echo test5;
dfgyfhgdtfgsdfsdf;
echo test6;
echo test7;
echo test8;
echo test9;
echo test10;
$ cat test.mjs
import {execa} from 'execa';
const { all } = await execa('./script.sh', {all: true });
console.log(all);
$ node test.mjs
test1
test2
test3
test4
test5
test6
test7
test8
test9
test10
./script.sh: line 6: dfgyfhgdtfgsdfsdf: command not found
The correct order would actually be:
test1
test2
test3
test4
test5
./script.sh: line 6: dfgyfhgdtfgsdfsdf: command not found
test6
test7
test8
test9
test10
Is it possible to get interleaved output with execa, the same as when running ./script.sh directly?
I can get interleaved output using 2>&1 as shown below, but I am wondering whether execa could do the same thing without 2>&1:
$ cat test.js
const { spawn } = require('child_process');
const child = spawn('./script.sh', ['2>&1'], { shell: true });
let mergedOut = '';
child.stdout.on('data', (chunk) => {
process.stdout.write(chunk);
mergedOut += chunk;
});
child.on('close', (_code, _signal) => {
console.log('-'.repeat(30));
console.log(mergedOut);
});
$ node test.js
test1
test2
test3
test4
test5
./script.sh: line 6: dfgyfhgdtfgsdfsdf: command not found
test6
test7
test8
test9
test10
------------------------------
test1
test2
test3
test4
test5
./script.sh: line 6: dfgyfhgdtfgsdfsdf: command not found
test6
test7
test8
test9
test10
I don't have an answer for you. The implementation is quite simple: https://github.com/sindresorhus/execa/blob/071d0a282a14633493e9d8b4e5b686c43da64368/lib/stream.js#L19-L35
Some possible reasons:
- Bug in the `merge-stream` package we use.
- Bug in Node.js streams (happens more often than you would think).
- `script.sh` behaving differently when used in the terminal and when used as a subprocess.
> `script.sh` behaving differently when used in the terminal and when used as a subprocess.
This Python script shows we can rule that one out. (Uses the same noop-132.mjs file as in the first post.)
import subprocess
task = subprocess.Popen(['./noop-132.mjs'], stdout=subprocess.PIPE, stderr=subprocess.STDOUT, encoding="utf8")
print(task.stdout.read())
❯ python3 tst.py
132
The problem is this: when using the `all` option, we currently guarantee the order of the stdout writes (among themselves) and of the stderr writes (among themselves), but not the order in which the two are interleaved.
In other words, stdout and stderr are both merged into the same destination (`all`), but they may do so at different paces.
Implementation-wise, this feature mostly relies on @sindresorhus/merge-streams, so any fix would probably need to be done on that dependency's codebase (which is just one small file).
https://github.com/sindresorhus/execa/blob/fcf648e21b6ccc9ec279fb758318ed7581f02fc6/lib/stream.js#L6-L13
Under the hood, what this is basically doing is:
childProcess.all = new PassThroughStream({objectMode: true});
childProcess.stdout.pipe(childProcess.all, {end: false});
childProcess.stderr.pipe(childProcess.all, {end: false});
You can also check the Node.js code for `Readable.pipe()`. Apparently, `childProcess.all.write()` is called as soon as a `data` event is emitted on `childProcess.stdout` or `childProcess.stderr`. I am not completely sure, but I believe the issue might be that when data is written to `childProcess.stdout` or `childProcess.stderr`, the `data` event is not necessarily emitted right away, for performance reasons (such as backpressure, or not holding the current microtask for too long). If the two competing streams (stdout and stderr) do this at different paces, a race condition results.
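To make the race concrete, here is a self-contained sketch (my simplification, not execa's actual code) of two `PassThrough` streams piped into a shared destination. The per-stream order is preserved, but the cross-stream interleaving is not:

```javascript
// Two source streams piped into one shared PassThrough, a simplification of
// the merging described above (an illustration, not execa's actual code).
import {PassThrough} from 'node:stream';

const stdout = new PassThrough();
const stderr = new PassThrough();
const all = new PassThrough();

stdout.pipe(all, {end: false});
stderr.pipe(all, {end: false});

const chunks = [];
all.on('data', chunk => chunks.push(chunk.toString()));

// Intended interleaved order: 1, 3, 2. But each source buffers its own
// writes and flushes on its own schedule, so the merged stream only
// guarantees per-source order ('1' before '2'), not the cross-stream order.
stdout.write('1');
stderr.write('3');
stdout.write('2');

// Let the piped streams flush (they emit on later ticks) before inspecting.
await new Promise(resolve => setImmediate(resolve));
console.log(chunks.join(''));
```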
If anyone has some idea on how to solve this at an abstract level, without compromising too much on performance, please share!
Since this is a timing/race-condition issue involving streams, reproducing it in CI will be difficult. However, I can reproduce it locally using @F3n67u's first example.
I did some additional research into this. The bottom line is: this behavior is inherited from Node.js, and I do not think we can change it, although users can apply some workarounds.
The core of the issue can be reproduced with this sample, which is a simplification of what Execa does under the hood:
// example.js
import process from 'node:process';
process.stdout.write('1');
process.stderr.write('2');
process.stdout.write('3');
// run.js
import {spawn} from 'node:child_process';
const childProcess = spawn('node', ['example.js']);
childProcess.stdout.on('data', value => {
console.log('stdout', value.toString());
})
childProcess.stderr.on('data', value => {
console.log('stderr', value.toString());
})
When running `node run.js`, the output is:
stdout 13
stderr 2
The two `process.stdout.write()` calls get merged into a single `data` event by Node.js. This happens for performance reasons: Node.js buffers write calls made in the same macrotask.
This is done at the core Node.js level, and there is not much Execa can do at the process-spawning level. Also, since it is done for performance reasons, it is mostly beneficial, even though it creates this out-of-order issue.
The workaround is to delay writes to a different macrotask. For example:
// example.js
import process from 'node:process';
import {setTimeout} from 'timers/promises';
process.stdout.write('1');
process.stderr.write('2');
await setTimeout(0);
process.stdout.write('3');
Results in:
stdout 1
stderr 2
stdout 3
The conclusion is:
- If a child process's `stdout` and `stderr` are both written to in the same macrotask,
- then `childProcess.all` will show first the `stdout` writes (in order), then the `stderr` writes (in order), or the other way around, but not interleaved.
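For completeness, here is a hedged end-to-end sketch (plain `child_process` plus a `PassThrough` standing in for execa's `all`; the inline child script and the merging code are illustrative, not execa's implementation) showing that separating the writes into different macrotasks restores the interleaved order:

```javascript
// End-to-end sketch of the workaround: the child separates its writes into
// different macrotasks, and a merged stream (a stand-in for execa's `all`)
// then receives them in the intended order.
import {spawn} from 'node:child_process';
import {PassThrough} from 'node:stream';

// Inline child script: writes '1', '2', '3' across separate macrotasks.
const childSource = `
  process.stdout.write('1');
  setTimeout(() => {
    process.stderr.write('2');
    setTimeout(() => process.stdout.write('3'), 0);
  }, 0);
`;

const child = spawn(process.execPath, ['-e', childSource]);

// Merge stdout and stderr into one stream, as execa's all option does.
const all = new PassThrough();
child.stdout.pipe(all, {end: false});
child.stderr.pipe(all, {end: false});

let merged = '';
all.on('data', chunk => {
  merged += chunk;
});

// 'close' fires once both stdio streams have ended, so all data has arrived.
await new Promise(resolve => child.on('close', resolve));
console.log(merged);
```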
I documented this workaround in #640.