eio
Signals abstraction
As discussed in #301 we might need to abstract Signals.
I've got a working prototype that works on uring and libuv but we have to define some things before I go forward.
1. Do we want to export more signals than Sys, fewer, or the same?
Sys is a bit conservative and I think we should export more. For example, it lacks SIGWINCH and SIGINFO (Linux doesn't have SIGINFO, but it's common on other Unix systems).
2. Do we want to restrict what we export ?
Like type signum = Sigint | Sigwhatever, or do we just keep accepting an int, so that users who want some other signal can use it?
I think we should keep taking an int, but do some discovery ourselves (for the value of SIGWINCH for example) and export that as a
val sigwinch : int
Worth noting that the default OCaml signal interface accepts arbitrary integers, so that would be no issue. It's also not an issue with Luv.
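A minimal sketch of the "plain int" option, using only the standard library. The value 28 for SIGWINCH is an assumption hard-coded for illustration (it matches Linux on x86 and the BSDs); the proposal is to discover it at build time instead.

```ocaml
(* Hypothetical result of build-time discovery; 28 is an assumption. *)
let sigwinch : int = 28

(* Set by the handler, polled by the application. *)
let resized = ref false

let () =
  (* Sys.set_signal takes any int: negative values are OCaml's portable
     constants (e.g. Sys.sigint), positive values are raw system numbers. *)
  Sys.set_signal sigwinch (Sys.Signal_handle (fun _ -> resized := true))
```

Nothing in the type stops a caller from passing a number that means something else on another platform, which is the trade-off against a closed variant type.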
3. What do we do with signals that are not supported ?
Do we want to fail hard, fail silently, or provide a "button" to control the behavior? This is likely very relevant for Windows, which has only a small set of signals.
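One possible shape for that "button", as a sketch only. The `on_unsupported` type, the `install` function and its labels are all hypothetical names, not part of eio or the prototype.

```ocaml
(* Hypothetical policy type: how to react when a signal is unavailable. *)
type on_unsupported = Fail | Ignore

(* [signum] is [None] when the platform lacks the signal.
   Returns [true] if a handler was actually installed. *)
let install ?(on_unsupported = Fail) ~name signum handler =
  match signum, on_unsupported with
  | Some n, _ ->
    Sys.set_signal n (Sys.Signal_handle handler);
    true
  | None, Ignore -> false
  | None, Fail ->
    invalid_arg (name ^ ": signal not supported on this platform")
```

Defaulting to `Fail` keeps the hard failure as the normal case, and a caller who knows the signal is optional can pass `~on_unsupported:Ignore`.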
Progress
I have something that's not in a reviewable state yet, but I think I now understand all the nits well enough for us to decide what to do: https://github.com/haesbaert/eio/tree/signal
The runtime processes signals in a more native way than libuv. I'll try to describe how both work so we can decide what we want; the differences are particularly important for multiple domains.
OCaml runtime signals + Domains
The runtime doesn't do a lot: signals are recorded in a bitmap and processed at some later point, outside the trampoline.
The signal bitmap is shared between multiple domains, and code is careful enough when accessing it concurrently, so all good.
This also means we fall back to standard pthread signal semantics: it's undefined which pthread (and therefore which Domain) gets the signal.
This makes signals pretty hard to use with multiple domains. If we wanted to cancel some Fiber, for example, we might have to cross Domains, depending on where the signal was delivered.
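To make the "record in a bitmap, process later" behaviour concrete, here is a toy model of it. This is not the runtime's code; `pending`, `record` and `process` are made-up names, and the model only shows why the processing domain is whichever one happens to look first.

```ocaml
(* Toy model only: one shared bitmap of pending signals, one bit per number. *)
let pending = Atomic.make 0

(* What the low-level handler does: record the signal and return. *)
let rec record signo =
  let old = Atomic.get pending in
  if not (Atomic.compare_and_set pending old (old lor (1 lsl signo)))
  then record signo

(* What a domain does at a safe point: take every pending signal and run
   its OCaml handler. Whichever domain gets here first wins. *)
let process handler =
  let bits = Atomic.exchange pending 0 in
  for signo = 0 to Sys.int_size - 1 do
    if bits land (1 lsl signo) <> 0 then handler signo
  done
```

Note that a bitmap also coalesces: recording the same signal twice before it is processed runs the handler once.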
Libuv is a Good Boi (tm)
Libuv handles signals differently: it has its own signal dispatch mechanism (and therefore also runs signal handlers outside the trampoline). Each signal handle is associated with a loop, and each loop is associated with a Domain (in our case).
One signal can have multiple handles, each associated with a loop, so in libuv you can have multiple handlers for SIGINT, one per pthread, and they can all do something cool with it, or not.
This is more useful than what the runtime offers: you can establish a handler that will be called from the Domain you are in, so you can cancel some Fiber or whatever. Or maybe you actually want every domain to receive the signal and do some cleanup before exit.
Opinion
I believe we should have a mechanism similar to libuv's. This would involve creating the dispatcher in the uring backend, which shouldn't be very hard. It also lets us be clearer about the semantics: "signals get processed in the domain where they were installed" instead of "signals get processed on a random Domain, rejoice".
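A rough model of what such a dispatcher could look like, to pin down the semantics being proposed. Every name here (`loop`, `install`, `dispatch`, `run_pending`) is hypothetical, and the sketch is deliberately not domain-safe: a real version would need locking and a way to wake the target loop.

```ocaml
(* Toy model: each "loop" (one per domain) owns a queue of delivered
   signals and the handlers it installed. All names are hypothetical. *)
type loop = {
  inbox : int Queue.t;
  handlers : (int, int -> unit) Hashtbl.t;
}

let make_loop () = { inbox = Queue.create (); handlers = Hashtbl.create 8 }

(* Signal number -> loops that installed a handler for it. *)
let subscribers : (int, loop) Hashtbl.t = Hashtbl.create 8

let install loop signo f =
  Hashtbl.replace loop.handlers signo f;
  if not (List.memq loop (Hashtbl.find_all subscribers signo)) then
    Hashtbl.add subscribers signo loop

(* Called once per incoming signal: fan out to every interested loop. *)
let dispatch signo =
  List.iter
    (fun l -> Queue.push signo l.inbox)
    (Hashtbl.find_all subscribers signo)

(* Called by each loop on its own domain: run the handlers it installed. *)
let run_pending loop =
  Queue.iter (fun s -> Hashtbl.find loop.handlers s s) loop.inbox;
  Queue.clear loop.inbox
```

The point of the split between `dispatch` and `run_pending` is that a handler only ever runs from the loop that installed it, so it can safely touch that domain's fibers.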
My tree
In my tree I implemented something similar to the runtime semantics:
- uring is like the runtime
- luv pins the signal to the first Domain it was installed on
This is mostly to get things going. It might be a first step for us to define the API and get something in that is at least useful for single domains.
On supported signals
I have added a discovery/config thingy to find the values of signals, as Sys is very conservative in the number of signals it exports.
I'd like signals whose existence we can't guarantee on every platform (looking at you, Windows) to have an option type. If you check the code you'll see that Config.Signum.siginfo_opt is an int option, and on Linux it's actually None.
I think this is good enough, since it forces users to acknowledge that the signal they are using may not be present, instead of assuming it is.
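A small illustration of how the option type forces that acknowledgement. The `Signum` module below is a stand-in written for this example (with the Linux result hard-coded), not the actual Config.Signum from the tree.

```ocaml
(* Stand-in for the generated config module; values are assumptions. *)
module Signum : sig
  val sigint : int              (* guaranteed everywhere *)
  val siginfo_opt : int option  (* may be missing on this platform *)
end = struct
  let sigint = Sys.sigint
  let siginfo_opt = None (* as on Linux *)
end

(* The caller cannot get at the number without handling [None]. *)
let describe = function
  | Some n -> Printf.sprintf "SIGINFO is signal %d" n
  | None -> "SIGINFO is not available here"
```

Passing `Signum.siginfo_opt` where an `int` is expected is a type error, so the "may not exist" case cannot be forgotten silently.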
On Linux, should we be using signalfd(2) instead? Then they'll get processed in the domain that read the FD.
Oh that's nice, I didn't know about it, I think that simplifies a lot.
At a high level, do we have a sense of what modern consumers of signals actually expect? The low-level interface is extremely difficult to get right in the presence of multiple threads, processes and runtimes.
Is there anything useful beyond:
- SIGINT: ctrl-c should work!
- SIGTERM: just exit the program
- SIGUSR1/2: callbacks to a single domain that set them up
If that's it, then we could just expose those behaviours from eio and actually have a hope of making them work in a cross-platform way :-)
SIGWINCH for terminal resizing, SIGINFO for polling program behaviour, SIGHUP for daemon reconfig
I'm in favour of just supporting OCaml callbacks for those use cases specifically, tied to the main domain, as none of them actually need multiple domains as far as I can tell.
- SIGINT: trigger cancellation of all the fibres
- SIGTERM: just exit
- SIGWINCH/INFO/USR1/2: user callbacks
- SIGHUP: don't think we should support this; it's hard to do in the presence of sandboxing (pledge, privsep) without a reexec.
SIGTERM for multiple-domain cleanup is a fairly common thing. Not supporting SIGHUP seems unnecessary; I don't see why we should take this away from the user. Cancelling fibers with SIGINT should be on a per-fiber basis and not a global thing (hence why I wrote sigbox). SIGWINCH/INFO/USR1/2 can be done as callbacks, but I still think it's clumsy, as users will need to do the signaling themselves and will very likely do it wrong, due to the issues already discussed in the PR for signals.
But I'm not gonna insist :)
Oh, and I mean SIGTERM is even more crucial, since you might need to do things for a proper cleanup. For example, mDNS should send a goodbye message unpublishing its records, and any other non-trivial software may need to do some cleanup too.
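For reference, this is roughly what cleanup on SIGTERM looks like today with only the standard library, in the single-domain case. The `on_term` registry is a hypothetical helper written for this example, and the sketch does nothing about the multi-domain problem discussed above.

```ocaml
(* Hypothetical cleanup registry; nothing here is eio API. *)
let cleanups : (unit -> unit) list ref = ref []

let on_term f = cleanups := f :: !cleanups

(* Run cleanups most-recent-first, then forget them. *)
let run_cleanups () =
  List.iter (fun f -> f ()) !cleanups;
  cleanups := []

let () =
  (* On SIGTERM: do the cleanup (e.g. send the goodbye message), then exit. *)
  Sys.set_signal Sys.sigterm
    (Sys.Signal_handle (fun _ -> run_cleanups (); exit 0))
```

A service would call something like `on_term (fun () -> ...)` once at start-up with its own unpublish step.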
Closed by #436.