lwt icon indicating copy to clipboard operation
lwt copied to clipboard

Lwt_unix.waitpid on Windows can block lwt

Open gabelevi opened this issue 7 years ago • 1 comments

On Windows, Lwt_unix.waitpid just delegates to Unix.waitpid, even without the Unix.WNOHANG flag. That means the process will make a blocking system call and the whole Lwt event loop gets stuck.

Looking at the code, I see Lwt relies on sigchld to drive the blocking waitpid calls, which obviously doesn't work on Windows (sidenote - this can actually cause many blocking waitpid commands to exit with EINTR when compiled with Lwt, since sigchld is usually ignored by default).

While it would be great if this could be fixed, at the very least you should document this behavior. You might even want to throw for blocking waitpid calls, since debugging why Lwt gets stuck is pretty tricky, and people can just call Unix.waitpid themselves if they really want that behavior.

Personally, I'll probably just switch to a loop, using WNOHANG and a sleep. I'm not an Lwt expert, but that might work for the Windows Lwt_unix.waitpid implementation too.

gabelevi avatar Nov 02 '17 15:11 gabelevi

Thanks for the report!

I've adjusted the docs (see the attached commit). We actually need to rewrite the Lwt_unix docs completely, they are very much inadequate.

For a Windows implementation of waitpid, we should probably wait on the process handle using one of the Win32 waiting functions. Lwt is perpetually behind on Windows, however, so I think we will want to address that organizational (or design) problem more generally first.

Any suggestions for a more immediate fix are welcome, however :) I think we should avoid a sleep-loop in Lwt itself, though, because overestimating sleeps, that repeatedly wake up, are likely not to be right for every Lwt user. We should probably go for a more "exact" solution.

  • [ ] Add tests for Lwt_unix.waitpid
    • On Windows, use timeout 2 > nul to sleep for one second.
  • [ ] Run Unix.waitpid in an Lwt job.
    • Due to how Lwt works, this probably means creating a fresh implementation. Look in win32unix. Perhaps, we can just call the underlying implementation function directly.

aantron avatar Nov 02 '17 15:11 aantron