Exceptions raised by signal handlers not properly handled?
From https://discuss.ocaml.org/t/lwt-how-to-catch-exceptions-raised-by-signal-handlers/565:
open Lwt.Infix let () = Lwt_unix.on_signal Sys.sigint (fun _ -> failwith "Caught SIGINT") |> ignore; let p = Lwt.finalize (fun () -> Lwt_unix.sleep 2. >>= fun () -> raise Pervasives.Exit) (fun () -> prerr_endline "handled exception"; Lwt.return ()) in Lwt_main.run p (* ocamlfind opt -linkpkg -package lwt.unix foo.ml && ./a.out *)This behaves differently depending on whether you interrupt it with Ctrl+C, or let it raise
Exit.
I'll investigate this and have a stab at it.
Back from very low availability … I'll investigate now.
Notes:
failwithis not an Lwt primitive, howeverraiseisn't either, andfinalizeis implemented astry_bind(src/core/lwt.ml:2071) which wraps calls to the main part in atry-with(src/core/lwt.ml:1970)
The issue seems to be that when a signal is caught, the callback is executed in a different promise than that created by the main part of the call to finalize.
It might be tempting to ask of Lwt to propagate the exception to “the right place”, but there is no such thing as the right place. In this example, there is one single promise sleeping, but in practice, what promise (promises?) should the exception be propagated to?
One possible alternative would be a sort of try_bind dedicated specifically to handling UNIX signals. Something like one of the following:
Lwt_unix.with_signal: (unit -> 'a Lwt.t) -> (int -> 'a Lwt.t option) -> 'a Lwt.t
Lwt_unix.with_signal: ?(sigint: unit -> 'a Lwt.t) -> ?(sigkill: unit -> 'a Lwt.t) -> … -> (unit -> 'a Lwt.t) -> 'a Lwt.t
(The option in the first version is to allow to not catch some signals (same as the fact that each handler is optional in the second version).
Any opinions on whether this should exist? What form it should have? Etc.?
Temporary workaround using wait
open Lwt.Infix
let () =
let w,u = Lwt.wait () in
Lwt_unix.on_signal
Sys.sigint
(fun _ ->
Lwt.wakeup_exn u (Failure "Caught SIGINT"))
|> ignore;
let p =
Lwt.finalize
(fun () ->
(Lwt_unix.sleep 2. >>= fun () -> raise Pervasives.Exit)
<?>
w
)
(fun () ->
prerr_endline "handled exception";
Lwt.return ())
in
Lwt_main.run p
I'd rather avoid a special version of try_bind just for signals. It won't solve the problem of surprising behavior of the existing try_bind, but be another thing to learn.
I think we should avoid executing in a promise terminology. Promises aren't containers for execution, they are just reference cells that help with synchronizing things that are executing (callbacks in the case of Lwt). The new manual draft will help to clear this up :)
I believe the issue is due to how the OCaml runtime executes signal handlers. The answer may be to handle signals better around I/O calls (Lwt_unix.sleep in the above example). The I/O call should know which promise to stuff any resulting exception into.
I still haven't looked at it deeply, so it's possible this is unsatisfactory for some other reason.
And welcome back :)
The upcoming manual should probably be precise about what Lwt_unix.on_signal does. Something along the line of “the callback triggered by the signal is not related to any specific promise – even if on_signal is call ‘within’ a promise.” And then it should probably also mention this workaround as an example of how to send info about a signal being received ‘to’ some promise.
Is there a draft of the new manual somewhere? Or an issue that tracks the progress of the new manual?
Thanks!
Also, this issue should be re-tagged from bug into documentation.
Agreed on the notes to be added to the manual. There isn't really a reviewable draft posted yet. There is a project instead of an issue: https://github.com/ocsigen/lwt/projects/2. And agreed on the tags. I added the docs tag. I'm thinking to keep the bug tag for now, as I'm not sure if this is also a design bug :)
Is there any update on this? Any better/cleaner examples that will work for handling signals?
@XVilka The current status of this issue is summarized above and by @gaborigloi in https://discuss.ocaml.org/t/565/9. I think the overall state for now is that this is difficult to solve and requires research, and there aren't enough "labor resources" available to do it, given other problems in Lwt and elsewhere.