interp: process substitution named pipes seem to be flaky on Darwin

Open mvdan opened this issue 5 years ago • 3 comments

Split from https://github.com/mvdan/sh/issues/513, since this one can be reproduced by @theclapp on his Mac.

It doesn't seem to be the same issue that we're seeing with pseudo-TTYs on CI, which we haven't been able to reproduce ourselves. These failed tests run without any sort of TTY.

Some examples:

    --- FAIL: TestRunnerRun/817 (0.01s)
        interp_test.go:2668: wrong output in "cat <(cat <(echo nested))":
            want: "nested\n"
            got:  "file exists\n"

    --- FAIL: TestRunnerRun/817 (0.00s)
        interp_test.go:2668: wrong output in "cat <(cat <(echo nested))":
            want: "nested\n"
            got:  ""

    --- FAIL: TestRunnerRun/817 (0.05s)
        interp_test.go:2667: wrong output in "echo foo bar > >(sed 's/o/e/g')":
            want: "fee bar\n"
            got:  ""

    --- FAIL: TestRunnerRun/814 (0.00s)
        interp_test.go:2684: wrong output in "sed 's/o/e/g' <(echo foo bar)":
            want: "fee bar\n"
            got:  "open /var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T/sh-interp-133253605: no such file or directory"
FAIL

Apparently, none of these can be reproduced on Linux, even with stress testing. Thoughts:

  • Is our code racy? Right now, the named pipe writer deletes the file when it is done writing, on the assumption that the reader goroutine has already opened the pipe.
  • Following from the above, we assume that writes to the named pipe won't succeed until the reads have happened. That assumption might be wrong, given the "no such file or directory" error above.
  • On Linux, unix.Mkfifo calls unix.Mknod, which calls the mknod syscall. On Darwin, unix.Mkfifo calls the mkfifo syscall via libc directly, presumably since the mknod syscall requires root privileges.
  • Are Linux and Darwin named pipe semantics any different? E.g. buffering, or deleting the file while a reader is still reading.
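
To make the assumption in the first two bullets concrete, here is a minimal sketch of the ordering we would be relying on. It is not the interp code: `roundTrip` and the temp-directory layout are hypothetical, and it uses `syscall.Mkfifo` from the standard library rather than `unix.Mkfifo` so that it has no dependencies. It assumes POSIX semantics, where a blocking write-only open of a fifo does not return until a reader has opened it.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
	"syscall"
)

// roundTrip sends msg through a freshly created named pipe and returns what
// the reader saw. Hypothetical helper for illustration, not taken from interp.
func roundTrip(msg string) (string, error) {
	dir, err := os.MkdirTemp("", "fifo-sketch-")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "pipe")
	if err := syscall.Mkfifo(path, 0o600); err != nil {
		return "", fmt.Errorf("cannot create fifo: %w", err)
	}

	writeErr := make(chan error, 1)
	go func() {
		// Assumption: a blocking write-only open of a fifo returns only
		// once a reader has opened the other end.
		w, err := os.OpenFile(path, os.O_WRONLY, 0)
		if err != nil {
			writeErr <- err
			return
		}
		// Under that assumption the reader already holds a descriptor,
		// so removing the path here cannot make its open fail.
		os.Remove(path)
		_, err = io.WriteString(w, msg)
		if cerr := w.Close(); err == nil {
			err = cerr
		}
		writeErr <- err
	}()

	// The read-only open blocks until the writer goroutine opens its end.
	r, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer r.Close()
	data, err := io.ReadAll(r) // returns once the writer closes its end
	if err != nil {
		return "", err
	}
	return string(data), <-writeErr
}

func main() {
	got, err := roundTrip("nested\n")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Print(got)
}
```

If the path is instead removed without the writer's open having completed first, a reader that has not yet opened the pipe would see "no such file or directory", which matches one of the failures above. Whether Darwin honours the same ordering under load is the open question.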

mvdan, May 30 '20 19:05

Another one:

--- FAIL: TestRunnerRun (0.03s)
    --- FAIL: TestRunnerRun/822 (0.00s)
##[error]        interp_test.go:2701: wrong output in "echo foo bar | tee >(sed 's/o/e/g') >/dev/null":
            want: "fee bar\n"
            got:  "cannot create fifo: file exists\nexit status 1"
FAIL
FAIL	mvdan.cc/sh/v3/interp	0.575s

mvdan, Aug 30 '20 15:08

Unfortunately, these are still present:

--- FAIL: TestRunnerRun (0.06s)
    --- FAIL: TestRunnerRun/#919 (0.01s)
        interp_test.go:3229: wrong output in "cat <(cat <(echo nested))":
            want: "nested\n"
            got:  "open /var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T/sh-interp-4767e74e48b9ec1c: no such file or directory"
FAIL
FAIL	mvdan.cc/sh/v3/interp	1.494s

So it was indeed wrong of me to close this out. I'll revert.

mvdan, Dec 30 '21 18:12