eio Signals abstraction

As discussed in #301 we might need to abstract Signals.

I've got a working prototype that works on uring and libuv but we have to define some things before I go forward.

1. Do we want to export more signals, fewer, or the same as Sys?

Sys is a bit conservative and I think we should export more. For example, it lacks SIGWINCH and SIGINFO (Linux doesn't have SIGINFO, but it's common on other Unix systems).

2. Do we want to restrict what we export ?

For example, type signum = Sigint | Sigwhatever. Or we just keep accepting an int, and if the user wants to use some other signal they have, they can. I think we should keep taking an int, but do some discovery ourselves (for the value of SIGWINCH, for example) and export that as val sigwinch : int. Worth noting that the default OCaml signal interface accepts arbitrary integers, so that would be no issue; it's also not an issue with Luv.
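To make the two options concrete, here is a minimal sketch. The names signum, sigwinch and to_int are only illustrative, and the value 28 is hard-coded (it is SIGWINCH on x86-64 Linux) where a real build would discover it at configure time.

```ocaml
(* Option A: a closed variant. Only the listed signals can be named. *)
type signum = Sigint | Sigterm | Sigwinch

(* Option B: plain integers, plus values discovered at configure time.
   [sigwinch] is a hypothetical discovered value; 28 is hard-coded here
   purely for illustration. *)
let sigint : int = Sys.sigint
let sigwinch : int = 28

(* With option A the library has to translate to a number itself,
   so every new signal needs a new constructor and a new case. *)
let to_int = function
  | Sigint -> sigint
  | Sigterm -> Sys.sigterm
  | Sigwinch -> sigwinch
```

With option B the user can pass any integer straight through, which is what Sys.set_signal already accepts.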

3. What do we do with signals that are not supported ?

Do we want to fail hard, fail silently, or provide a "button" to control the behaviour? This is likely very relevant for Windows, which has only a small set of signals.
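One possible shape for such a "button", as a rough sketch only. The type on_unsupported, the exception and set_handler are made-up names, and a missing signal is modelled as None; none of this is an existing eio API.

```ocaml
(* Hypothetical policy for signals the platform does not provide. *)
type on_unsupported =
  | Fail_hard      (* raise an exception *)
  | Fail_silently  (* install nothing and carry on *)

exception Unsupported_signal of string

(* [signum] is [None] when the signal does not exist on this platform.
   Returns [true] when a handler was actually installed. *)
let set_handler ?(on_unsupported = Fail_hard) ~name signum handler =
  match signum with
  | Some n ->
    Sys.set_signal n (Sys.Signal_handle handler);
    true
  | None ->
    (match on_unsupported with
     | Fail_hard -> raise (Unsupported_signal name)
     | Fail_silently -> false)
```

Defaulting to Fail_hard is just one choice; the point is that the caller decides per call.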

Sep 20 '22 10:09 haesbaert

Progress

I have something not in a reviewable state, but I think now I understand all the nits for us to decide what to do: https://github.com/haesbaert/eio/tree/signal

The runtime processes signals in a more native way than libuv. I'll try to describe how both work so we can decide what we want; the differences are particularly important for multiple domains.

OCaml runtime signals + Domains

The runtime doesn't do a lot: signals are recorded in a bitmap and processed at some later point, outside the trampoline. The signal bitmap is shared between all domains, and the code is careful when accessing it concurrently, so all good. This also means we fall back to standard pthread signal semantics: it's unspecified which pthread (and therefore which Domain) gets the signal.

This makes signals pretty hard to use with multiple domains. If we wanted to cancel some Fiber, for example, we might have to cross Domains depending on where the signal was delivered.
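For reference, this is roughly what using the runtime interface directly looks like. It is a plain Stdlib sketch, not code from the prototype; got_sigint and on_sigint are illustrative names.

```ocaml
(* Flag shared by all domains; Atomic keeps the access well-defined. *)
let got_sigint = Atomic.make false

(* The runtime calls this at a safe point, on whichever thread (and so
   whichever domain) the signal happened to be delivered to. *)
let on_sigint (_signum : int) = Atomic.set got_sigint true

let () = Sys.set_signal Sys.sigint (Sys.Signal_handle on_sigint)
```

The handler cannot assume it runs in the domain that owns the fiber it wants to cancel, so all it can safely do here is set a flag for someone else to poll.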

Libuv is a Good Boi (tm)

Libuv handles signals in a different way: it has its own signal dispatch mechanism (and therefore also runs handlers outside the trampoline). Each signal handle is associated with a loop and each loop is associated with a Domain (in our case). One signal can have multiple handles, each associated with a loop, so in libuv you can have multiple handlers for SIGINT, one for each pthread, and they all do something cool with it, or not.

This is more useful than what the runtime offers: you can establish a handler that will be called from the Domain you are in, so you can maybe cancel some Fiber or whatever. Or maybe you actually want every domain to receive the signal and do some cleanup before exit.

Opinion

I believe we should have a mechanism similar to libuv's. This would involve creating the dispatcher in the uring backend, which shouldn't be very hard. It also allows us to be clearer about the semantics: "signals get processed in the domain they were installed in" instead of "signals get processed on a random Domain, rejoice".
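A toy model of those semantics, to show what "processed in the domain they were installed in" would mean. This is not the prototype's code: domain ids are plain ints, the registry is a bare Hashtbl with no locking, and install and dispatch are hypothetical names.

```ocaml
(* Each handler remembers the id of the domain that installed it. *)
type handler = { domain : int; callback : int -> unit }

(* Registry: signal number -> handlers. A real dispatcher would need to
   protect this against concurrent access; this sketch does not. *)
let registry : (int, handler list) Hashtbl.t = Hashtbl.create 8

let install ~domain signum callback =
  let existing = Option.value (Hashtbl.find_opt registry signum) ~default:[] in
  Hashtbl.replace registry signum ({ domain; callback } :: existing)

(* Called from a domain's event loop once it learns that [signum]
   arrived: only the handlers that domain installed are run. *)
let dispatch ~domain signum =
  Hashtbl.find_opt registry signum
  |> Option.value ~default:[]
  |> List.iter (fun h -> if h.domain = domain then h.callback signum)
```

Several domains can register for the same signal, and each one only ever runs its own callbacks.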

My tree

In my tree I implemented something similar to the runtime semantics:

uring is like the runtime
luv pins the signal to whichever Domain it was first installed on

This is more to get things going and it might be a first step for us to define the API and get something in that at least is useful for single domains.

On supported signals

I have added a discovery/config thingy to find the values of signals, as Sys is very conservative with the number of signals exported. I'd like signals whose existence we can't guarantee on some platform (looking at you, Windows) to have an option type. If you check the code you'll see that Config.Signum.siginfo_opt is an int option, and on Linux it's actually None. I think this is good enough since it forces the user to acknowledge that the signal they are using may not be present, instead of assuming it is.
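From the user's side that would look something like the sketch below. The Config.Signum module here is a hard-coded stand-in written for this example (the real one would be generated by discovery), and describe is an illustrative helper.

```ocaml
(* Stand-in for the discovery module; the value is hard-coded for
   illustration and mirrors the Linux case described above. *)
module Config = struct
  module Signum = struct
    let siginfo_opt : int option = None
  end
end

(* The option type forces the caller to handle the missing case. *)
let describe = function
  | Some n -> Printf.sprintf "SIGINFO is signal %d" n
  | None -> "SIGINFO is not available on this platform"

let () = print_endline (describe Config.Signum.siginfo_opt)
```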

Sep 23 '22 15:09 haesbaert

On Linux, should we be using signalfd(2) instead? Then they'll get processed in the domain that read the FD.

Sep 26 '22 14:09 talex5

Oh that's nice, I didn't know about it, I think that simplifies a lot.

Sep 28 '22 15:09 haesbaert

At a high level, do we have a sense of what modern consumers of signals actually expect? The low-level interface is extremely difficult to get right in the presence of multiple threads, processes and runtimes.

Is there anything useful beyond:

SIGINT: ctrl-c should work!
SIGTERM: just exit the program
SIGUSR1/2: callbacks to a single domain that set them up

If that's it, then we could just expose those behaviours from eio and actually have a hope of making them work in a cross-platform way :-)

Dec 15 '22 11:12 avsm

SIGWINCH for terminal resizing, SIGINFO for polling program behaviour, SIGHUP for daemon reconfig

Dec 15 '22 11:12 haesbaert

I'm in favour of just supporting OCaml callbacks for those use cases specifically, tied to the main domain, as none of them actually need multiple domains as far as I can tell.

SIGINT: trigger cancellation of all the fibres
SIGTERM: just exit
SIGWINCH/INFO/USER1/2 : user callbacks
SIGHUP : don't think we should support this, it's hard to do in the presence of sandboxing (pledge, privsep) without a reexec.

Dec 21 '22 21:12 avsm

SIGTERM for multiple domain cleanup is a kinda common thing. Not supporting SIGHUP seems unnecessary; I don't see why we should take this away from the user. Cancelling fibers with SIGINT should be on a per-fiber basis and not a global thing (hence why I wrote sigbox). SIGWINCH/INFO/USR1/2 can be done as callbacks, but I still think it's clumsy, as the user will need to do the signaling themselves and will very likely do it wrong due to the issues already discussed in the PR for signals.

But I'm not gonna insist :)

Dec 21 '22 21:12 haesbaert

Oh, and SIGTERM is even more crucial, since you might need to do things for a proper cleanup. For example, mDNS should send a goodbye message unpublishing its records, and the same goes for any other non-trivial software that needs to do some cleanup.
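As a single-domain illustration of that kind of cleanup, using only Stdlib and not any eio API: cleanup and cleaned_up are placeholder names, and the body of cleanup stands in for something like sending the mDNS goodbye.

```ocaml
let cleaned_up = ref false

(* Placeholder for real work, e.g. unpublishing mDNS records. *)
let cleanup () = cleaned_up := true

let () =
  (* Run the cleanup on any normal exit... *)
  at_exit cleanup;
  (* ...and turn SIGTERM into a normal exit so the cleanup still runs. *)
  Sys.set_signal Sys.sigterm (Sys.Signal_handle (fun _signum -> exit 0))
```

This only covers one domain; with several domains each would need its own chance to clean up, which is the case being argued for here.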

Dec 21 '22 21:12 haesbaert

Closed by #436.

Feb 06 '23 15:02 talex5
