lwt icon indicating copy to clipboard operation
lwt copied to clipboard

EBADF from Lwt_process.with_*

Open mroch opened this issue 2 years ago • 4 comments

I've received reports of Unix.Unix_error(Unix.EBADF, "close", "") exceptions that stem from this process#close:

https://github.com/ocsigen/lwt/blob/7355895d89e11aed36d2348380029c57d3124aec/src/unix/lwt_process.cppo.ml#L350

Unfortunately I don't have a repro yet to understand what the process we're calling is doing, but we're calling Lwt_process.with_process_full here:

https://github.com/facebook/flow/blob/38b41863de17211ae70c8bbc57854fbe7fed1ce3/src/common/lwt/lwtSysUtils.ml#L54

According to close(2), EBADF in this case means that the file descriptor that Lwt created internally is invalid. Would it make sense to catch and suppress this exception?

Whatever the underlying exception is is being swallowed. Even if close's exception is ignored, it'd be nice to rethrow the original exception with its original backtrace if possible.

mroch avatar Jun 24 '22 16:06 mroch

Hi @mroch and thanks for the report,

Turns out @MisterDA quickly fixed a bug that's probably the root cause of the issue you encountered. Is the issue still ongoing with the #957 merged and now released?

raphael-proust avatar Jun 29 '22 12:06 raphael-proust

i don't think it was the root cause because (according to that PR) it broke in 9de01c9f8ddb5f4956f1473b657c055c1974642f which was in 5.6.0, but we're using 5.5.0.

i'm going to use a workaround in Flow to see if I can figure out what the underlying issue that causes the invalid FD is, so that we can hopefully find a repro/write a test.

mroch avatar Jun 29 '22 13:06 mroch

Ok, thanks for the info. I'll try to have a look soon to check whether we could forward the original exception and what other issue might be the root cause.

raphael-proust avatar Jun 29 '22 13:06 raphael-proust

circling back with results from our workaround. these are the two underlying exceptions we've gotten reports of so far (still no actual repro unfortunately):

Unix.Unix_error(Unix.EBADF, "read", "")
Raised by primitive operation at Lwt_unix.read_bigarray.(fun) in file "src/unix/lwt_unix.cppo.ml", line 681, characters 8-59
Called from Lwt_unix.retry_syscall in file "src/unix/lwt_unix.cppo.ml", line 500, characters 13-24
Re-raised at LwtSysUtils.command_result_of_process.(fun) in file "flow/src/common/lwt/lwtSysUtils.ml", line 36, characters 2-160
Re-raised at Exception.reraise in file "flow/src/hack_forked/utils/core/exception.ml", line 36, characters 33-76
Called from Lwt.Sequential_composition.backtrace_bind.create_result_promise_and_callback_if_deferred.callback in file "src/core/lwt.ml", line 1927, characters 23-26
Re-raised at LwtSysUtils.with_process_full.(fun) in file "flow/src/common/lwt/lwtSysUtils.ml", line 73, characters 6-70
Re-raised at LwtSysUtils.with_process_full.(fun) in file "flow/src/common/lwt/lwtSysUtils.ml", line 69, characters 2-227
Re-raised at Watchman.get_sockname.(fun) in file "flow/src/hack_forked/watchman/watchman.ml", line 384, characters 2-934
Re-raised at FileWatcher.WatchmanFileWatcher.watchman#wait_for_init.go_exn.(fun) in file "flow/src/monitor/fileWatcher.ml", line 534, characters 10-1023
...
Unix.Unix_error(Unix.EBADF, "set_nonblock", "")
Raised at LwtSysUtils.command_result_of_process.(fun) in file "flow/src/common/lwt/lwtSysUtils.ml", line 36, characters 2-160
Re-raised at Exception.reraise in file "flow/src/hack_forked/utils/core/exception.ml", line 36, characters 33-76
Called from Lwt.Sequential_composition.backtrace_bind.create_result_promise_and_callback_if_deferred.callback in file "src/core/lwt.ml", line 1927, characters 23-26
Re-raised at LwtSysUtils.with_process_full.(fun) in file "flow/src/common/lwt/lwtSysUtils.ml", line 73, characters 6-70
Re-raised at LwtSysUtils.with_process_full.(fun) in file "flow/src/common/lwt/lwtSysUtils.ml", line 69, characters 2-227
Re-raised at Watchman.get_sockname.(fun) in file "flow/src/hack_forked/watchman/watchman.ml", line 384, characters 2-934
Re-raised at FileWatcher.WatchmanFileWatcher.watchman#wait_for_init.go_exn.(fun) in file "flow/src/monitor/fileWatcher.ml", line 534, characters 10-1023
...

both are coming from this code:

let command_result_of_process process =
  (* Wait for it to finish *)
  let%lwt status = process#status
  and stdout = Lwt_io.read process#stdout
  and stderr = Lwt_io.read process#stderr in
  ...

this is the f to with_process_full, so there's been no exception that would cause make_with to call process#close yet. I can't figure how process#stdout or process#stderr could already be closed.

mroch avatar Jul 20 '22 14:07 mroch