msys2-runtime

read after select blocks for pipe to external program when running MSYS program in UCRT64

Open purplesyringa opened this issue 1 year ago • 8 comments

I'm rather inexperienced with MSYS2, so I read on the website that UCRT64 is the recommended default environment and started using UCRT64 only. I needed Python, so I ran pacman -S python3, which pulled the MSYS version of the package, which I then used in the UCRT64 environment. I'm not sure if running MSYS packages in UCRT64 is even supported, so please do tell me if I was lucky it didn't break earlier.

This worked just fine until the following bug surfaced.

Python has a neat subprocess.Popen.communicate facility for running a subprocess with (possibly long) input and retrieving output without deadlocking on stdio. How it is implemented depends on the OS, but on POSIX it creates anonymous pipes for both stdin and stdout, poll(2)s the two fds, and writes to stdin or reads from stdout depending on which one becomes available. For some reason, my Python code experienced weird hangs in this function which can be reproduced as follows:

  1. Start UCRT64 environment.
  2. Install the python3 package. Make sure that which python3 resolves to /usr/bin/python3.
  3. Compile the following test script via GCC/MSVC/whatever into repr.exe:
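#include <windows.h>
#include <stdio.h>

char buffer[4096];

int main(void) {
    DWORD n_read;
    /* Read stdin in chunks of up to 4096 bytes until ReadFile fails
       (for a pipe, that happens once the writer closes its end). */
    while (ReadFile(GetStdHandle(STD_INPUT_HANDLE), buffer, sizeof(buffer), &n_read, NULL)) {
        /* DWORD is an unsigned long, so print it with %lu. */
        fprintf(stderr, "Read %lu\n", (unsigned long)n_read);
    }
    return 0;
}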
  4. Run python3 -c 'import subprocess; subprocess.run("./repr", input=b"\x00" * 65537, stdout=subprocess.PIPE)'.

Expected behavior: the program reads 16 4096-byte chunks and one 1-byte chunk and then the Python interpreter returns.

Actual behavior: the program reads 16 4096-byte chunks and then hangs.
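The expected chunk pattern follows from simple arithmetic: 65537 = 16 × 4096 + 1. A minimal sketch of that calculation is below. The helper name `expected_read_sizes` is hypothetical, and the sketch assumes every read fills the buffer whenever enough data is pending, which a pipe does not strictly guarantee.

```python
def expected_read_sizes(total_bytes, buffer_size):
    """Sizes of successive reads, assuming each read fills the buffer if it can."""
    full_chunks, remainder = divmod(total_bytes, buffer_size)
    sizes = [buffer_size] * full_chunks
    if remainder:
        sizes.append(remainder)
    return sizes


if __name__ == "__main__":
    sizes = expected_read_sizes(65537, 4096)
    # 17 reads in total; the final read carries the single leftover byte.
    print(len(sizes), sizes[-1])
```

With the hang described above, only the first 16 entries of that list are ever reported by repr.exe; the final 1-byte read never completes.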

I don't think this is Python's bug, because Python simply does poll+read. The bug cannot be reproduced when using the MSYS or MINGW64 environment, or when using native Python in UCRT64. Not in Cygwin either. It is precisely the interaction of POSIX Python with the native UCRT64 environment that triggers the problem. I'm sort of at a loss here, because it doesn't seem like the underlying libc should affect anything, but here we are.
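For readers unfamiliar with the mechanism, here is a simplified sketch of a poll-driven loop of the kind described above. It is not CPython's actual `communicate` implementation. The names `communicate_with_poll` and `ECHO_CHILD` are hypothetical, the child is a stand-in written in Python rather than repr.exe, and the sketch assumes a POSIX-like system where `select.poll` and `select.PIPE_BUF` are available.

```python
import os
import select
import subprocess
import sys

# Hypothetical stand-in child: copies stdin to stdout in 4096-byte chunks.
ECHO_CHILD = [
    sys.executable, "-c",
    "import os\n"
    "while True:\n"
    "    c = os.read(0, 4096)\n"
    "    if not c:\n"
    "        break\n"
    "    os.write(1, c)\n",
]


def communicate_with_poll(argv, data):
    """Feed `data` to the child's stdin and collect its stdout via poll()."""
    proc = subprocess.Popen(argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    stdin_fd, stdout_fd = proc.stdin.fileno(), proc.stdout.fileno()

    poller = select.poll()
    poller.register(stdin_fd, select.POLLOUT)
    poller.register(stdout_fd, select.POLLIN)
    open_fds = {stdin_fd, stdout_fd}

    view, offset, output = memoryview(data), 0, []
    while open_fds:
        for fd, events in poller.poll():
            if fd == stdin_fd:
                if events & select.POLLOUT:
                    try:
                        # Writing at most PIPE_BUF bytes after POLLOUT should not block.
                        offset += os.write(fd, view[offset:offset + select.PIPE_BUF])
                    except BrokenPipeError:
                        offset = len(view)  # reader went away; stop writing
                else:
                    offset = len(view)  # POLLERR/POLLHUP: reader went away
                if offset >= len(view):
                    poller.unregister(fd)
                    proc.stdin.close()  # signals EOF to the child
                    open_fds.discard(fd)
            else:
                chunk = os.read(fd, 32768)
                if chunk:
                    output.append(chunk)
                else:  # EOF on the child's stdout
                    poller.unregister(fd)
                    proc.stdout.close()
                    open_fds.discard(fd)
    return b"".join(output), proc.wait()


if __name__ == "__main__":
    out, returncode = communicate_with_poll(ECHO_CHILD, b"\x00" * 65537)
    print(len(out), returncode)
```

The point of the loop is that the parent only writes when the stdin pipe is reported writable. If that readiness report never arrives for the final byte, the parent waits in `poll()` indefinitely, which would match the hang after 16 chunks.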

(I'm not sure if this is the right place to report this bug, because it is the result of interaction of several subsystems, so please tell me if I should go elsewhere.)

purplesyringa • Feb 19 '24 23:02

I can reproduce this. Also in Cygwin, so I reported it on IRC.

dscho • Feb 26 '24 08:02

> I'm not sure if running MSYS packages in UCRT64 is even supported

Forgot to say: @purplesyringa this is completely within what is supported. Thank you for reporting this bug and providing such an easy reproducer.

dscho • Feb 26 '24 09:02

@dscho, I tried to reproduce this with cygwin 3.5.1 but could not. Which cygwin version did you try?

tyan0 • Mar 01 '24 11:03

I also cannot reproduce with MSYS_NT-10.0-19045 HP-Z230 3.4.10--api-345.x86_64 2024-02-15 22:56 UTC x86_64 Msys and MSYS_NT-10.0-19045 HP-Z230 3.4.10.x86_64 2024-02-10 08:39 UTC x86_64 Msys.

tyan0 • Mar 01 '24 11:03

With this test case, repr.exe is a native win32 binary (not a cygwin/msys2 one), and python3 is a cygwin/msys2 binary. Right?

tyan0 • Mar 01 '24 11:03

Ah, I could reproduce when repr.exe is built with mingw UCRT compiler. If this is compiled with /mingw64/bin/gcc the issue does not occur.

tyan0 • Mar 01 '24 11:03

> Ah, I could reproduce when repr.exe is built with mingw UCRT compiler. If this is compiled with /mingw64/bin/gcc the issue does not occur.

That's strange because I did compile with /mingw64/bin/gcc and it hangs reliably over here.

In any case, thank you for looking into this, @tyan0!

dscho • Mar 01 '24 22:03

I found the cause. Currently, a select() call on the write side of a pipe can hang when the reader is a non-cygwin process. However, it is difficult to explain why. Please refer to https://github.com/mirror/newlib-cygwin/blob/master/winsup/cygwin/select.cc#L614 for details.

I will submit a patch shortly for review. Thanks.
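As background, the sketch below shows the behavior a writer normally relies on from select() on the write end of a pipe: not ready while the pipe is full, ready again once the reader has drained it. This only illustrates the general POSIX-style semantics on an ordinary anonymous pipe. It says nothing about how Cygwin implements the check in select.cc, and the helper name `pipe_write_readiness` is hypothetical.

```python
import os
import select


def pipe_write_readiness():
    """Return (writable_when_full, writable_after_drain) for an anonymous pipe."""
    read_fd, write_fd = os.pipe()
    try:
        os.set_blocking(read_fd, False)
        os.set_blocking(write_fd, False)
        chunk = b"\x00" * select.PIPE_BUF

        # Fill the pipe until the kernel refuses to accept more data.
        try:
            while True:
                os.write(write_fd, chunk)
        except BlockingIOError:
            pass

        # A zero timeout turns select() into a non-blocking readiness probe.
        _, writable, _ = select.select([], [write_fd], [], 0)
        writable_when_full = bool(writable)

        # Drain everything the reader side can see.
        try:
            while os.read(read_fd, 65536):
                pass
        except BlockingIOError:
            pass

        _, writable, _ = select.select([], [write_fd], [], 0)
        writable_after_drain = bool(writable)
        return writable_when_full, writable_after_drain
    finally:
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    print(pipe_write_readiness())
```

A writer that waits without a timeout depends on the second probe eventually reporting the pipe as writable. If it never does, the writer stays blocked in select(), which is the kind of hang reported in this issue.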

tyan0 • Mar 03 '24 05:03