Support for signals
Hello
Reading https://github.com/rust-lang-nursery/rust-cookbook/issues/501, I think the background thread is a somewhat suboptimal solution. It works, yes, but it's not nice.
So I wonder whether it would be possible to have some signal-compatible channels. Writing into the channel inside the signal handler is currently not safe AFAIK, because the Waker uses locks. Do you think it could be possible to devise some way to wake threads that doesn't involve locks? I know only of the self-pipe trick, but that might be too slow for the wakeup in crossbeam-channel?
Do you think it could be possible to devise some way to wake threads that doesn't include locks?
Maybe, but I can't think of a way of doing this properly at the moment. :( Note that even thread::unpark() uses locks so that rules it out.
I know only of the self-pipe trick, but that might be too slow for the wakeup in crossbeam-channel?
I suppose a lock-free wakeup mechanism would have to use select and pipes to park/unpark threads. Moreover, listener registration in Waker would have to avoid locks. And avoiding locks often implies allocation, but allocators internally use locks (not sure if that really is a problem, though).
Long story short, my feeling is that this is a solvable problem in theory, but would be incredibly difficult to do in practice. I'd have to think about it a bit more...
I guessed it might be a hard problem, and that it probably isn't desired enough to warrant making the usual case significantly harder, or something like that. So these are more like brainstorming ideas that might lead nowhere, just in case:
- Yes, allocation is also disallowed inside the signal handler. Actually, the list of things one might do is very limited.
- However, the allocation is disallowed only inside the signal handler. In other words, the lock-freeness might be asymmetric ‒ allocation in the registration would be fine so long as the signal handler could do a wakeup without it.
- It probably wouldn't even have to wake up all the threads, just one of the waiting ones ‒ if that one could be somehow told to wake up the others.
Anyway, yes, pipe(s) and either some kind of select or read is the canonical way to wake something from signal handlers. The other is doing IO in the thread and relying on getting EINTR from one of them to wake up or checking in between the IO ‒ the signal handler just sets some kind of flag and leaves it up to EINTR to wake whatever is needed.
What if SyncWaker used a RwLock rather than Mutex? Is it then safe for the signal handler to do a .read() lock?
Unfortunately, no :-(. For two reasons:
- That thing (or pthread equivalent) is not listed as async-signal-safe.
- The actual reason why it isn't listed is likely as follows. Signals are „parasitic" in a way: the kernel chooses a thread and injects the signal handler into it at a completely arbitrary place. If that thread happens to be holding the lock (in this case the write() lock) at the moment the signal handler hits it, we get a deadlock ‒ the handler can't get the read() lock, because the write() lock is already held, but the thread can't release the write() lock, because the signal handler is sitting on top of its stack, blocked on the read() lock, waiting.
While it might be possible to devise some way to make sure this doesn't happen in a specific case, that would still likely be unportable, simply because any function not listed as async-signal-safe can do something else (like allocate) on some other POSIX system.
Hmm. I just found one thing: sem_post is async-signal-safe (pthread_cond_signal isn't). So it could be possible to do the blocking/wakeup using that.
Question: is the use of mutexes and condvars allowed inside signals if we make sure that they are not used at the same time within two concurrent invocations of the signal handler? For example:
static SIGNALS: [AtomicBool; ...] = ...;
static LOCK: AtomicBool = ...;

fn handler(sig_id: usize) {
    if SIGNALS[sig_id].swap(true, SeqCst) {
        return;
    }
    'lock: loop {
        // This simple lock makes sure mutexes and condvars are not used at the
        // same time within two concurrent invocations of this signal handler.
        if LOCK.swap(true, SeqCst) {
            return;
        }
        'notify: loop {
            for (id, sig) in SIGNALS.iter().enumerate() {
                if sig.swap(false, SeqCst) {
                    // Might use real mutexes and condvars.
                    send_message_over_channel(id);
                    continue 'notify;
                }
            }
            break;
        }
        LOCK.store(false, SeqCst);
        for sig in &SIGNALS {
            if sig.load(SeqCst) {
                continue 'lock;
            }
        }
        break;
    }
}
POSIX whitelists certain functions as async-signal-safe. Using anything outside this whitelist is generally considered UB, though systems may whitelist additional functions (their use in a signal handler is then not portable). Notably, neither mutexes nor condvars are on this list. So you might get away with it on some OSes and in certain situations, but in general, no, they are de jure not allowed.
Furthermore, the problem is not so much concurrent signal handlers (though these are problematic too), but the fact that a signal handler „hijacks" an arbitrary thread to run on. So if you hold a mutex in the normal thread and a signal comes, sits on top of its stack, and tries to lock it too, you get a deadlock.
In general, I don't think it's worth the risks and complexity. I know the code above is just a general suggestion, not fully debugged and final, but still, I suspect there's a race condition where it won't wake up in certain situations, and I'm not 100% sure it isn't deadlocky in some cases either.
But I do want to get around to trying out replacing the Waker internals with a semaphore, which notably is async-signal-safe (well, waking it is). It is, however, buried deep in my TODO list.
Here is an example of "communicating" between threads in the context of signal handling and interrupted threads:
https://github.com/servo/servo/blob/5d479109ef3bdbf0d937f6ea318e9cce8bbde2b3/components/background_hang_monitor/sampler_linux.rs#L186
The corresponding signal handler is found at https://github.com/servo/servo/blob/5d479109ef3bdbf0d937f6ea318e9cce8bbde2b3/components/background_hang_monitor/sampler_linux.rs#L268
The code is actually based on another project: https://bitbucket.org/nikhilm/vignette/src/master/
Both codebases are using techniques from the C++ equivalent: https://dxr.mozilla.org/mozilla-central/rev/3d88030030a181816e5fe300b6b8d66cb718d8ba/tools/profiler/core/platform-linux-android.cpp#320
Further background info can be found at https://medium.com/programming-servo/programming-servo-a-background-hang-monitor-73e89185ce1
The setup uses a semaphore (libc::sem_t) for sending "wake-up" messages, as well as an AtomicPtr for sending "messages containing data", one could say.
In this setup, the flow goes like this:
- Have a thread A send a signal to thread B. We are now in the "critical section" where no allocations can take place and so on.
- A blocks on a "wake-up" message from B, received via the semaphore.
- B handles signal sent by A, writes some data to shared-state atomically.
- B sends the "wake-up" message to A with the semaphore. Finally it blocks on a "wake-up" message that will later be sent by A.
- A wakes-up, and reads the data in the shared-state.
- A does some work with the shared-state.
- A sends a "wake-up" message back to B, and blocks on receiving a final message back from B.
- A receives a final message from B, that B has "resumed", and we leave the "critical section" once that message has been received.
So if you're not in control of sending the signal (maybe you want to handle signals sent by other processes or the system itself), unlike in the above setup, I can imagine a slightly different setup:
- Have two threads, A and B.
- Have two pieces of shared-state: an atomic "waiting" flag, and the data that you'd like to exchange between the signal handler and the rest of the system, say wrapped in an AtomicPtr.
- Have thread A set "waiting" to true initially, and wait on a "shared-state ready" message from the semaphore.
- Have thread B register signal handlers, and sleep(?) or do something else.
- When a signal comes in, Thread B should spin until "waiting" is true.
- When it is true, it can manipulate the shared-state, and then post the "shared-state ready" message, then it must itself wait on a "done with shared-state" message on the semaphore.
- Thread A wakes up, sets "waiting" to false, and can manipulate the shared-state. When that is done, it sends the "done with shared-state" message, and waits on the "resumed" message.
- Thread B wakes up, sends the "resumed" message, and exits the signal-handler.
- Thread A wakes up, and does whatever it needs to do to propagate the state across the system (use Waker, or just send a message on a crossbeam channel).
- When thread A is done with that, it sets "waiting" back to true, and waits on a message from the semaphore.
So the difference here is that the signal handler needs to know that the other thread is "ready" for the dance to begin, since the signal didn't come from that thread like in the first setup.
I'll probably have to read this suggestion a few more times to understand exactly what you propose here. Anyway, my impression is that this is not what I'm looking for. I may not have stated it explicitly, but my goal here is not to spawn any additional threads for signal handling.
If I relaxed that requirement and allowed for a „shovel" thread, it could be done in a much simpler way. For example, there's probably no reason for the signal handler to wait for thread A to finish processing ‒ it can simply bump the semaphore, and the thread can just keep eating from the semaphore and pumping these „tokens" into the channel.
Furthermore, there seem to be a lot of little details that can turn this into a hornet's nest. In particular, you have to mask the signal out of A's signal mask (the signal handler just must not run inside A, or it'll deadlock). If you want to be able to handle multiple signals, you'll either need to know all the signals at once (to mask them all in the shared A), or start multiple threads. But the threads won't be able to mask each other's signals (they don't know of each other), so it can happen that they get to handle each other's signals at the same time ‒ again creating a deadlock. Mandating that all signals be known in advance in one place may be suitable for an end application, but not for a library.
As for my experiments… I've tried to read crossbeam-channel's code. Unfortunately, the code is quite complex and I'm not sure where to start modifying it. It's no longer a crate friendly to a drive-by contributor who just wants to do this one little thing ☹. I'm still confident this should work in theory; I'm just not sure about the performance implications.
my goal here is not to spawn any additional threads for signal handling
On second thought, I think the concept of "waiting" I introduced wouldn't actually work in practice, because once a thread is suspended by a signal (and is now inside the signal handler), you've entered a "critical section" and shouldn't do any allocations. The "shovel" thread could still be doing work while another thread has been suspended and is "waiting" for it to finish. That doesn't work, because by then we're already in a critical section: the shovel thread could deadlock doing any allocation, meaning the signal handler could then be stuck waiting as well...
So the example in Servo and elsewhere I linked to above only works because thread A is the one sending the signal to thread B: the "shovel" thread A knows when the critical section starts, because it is the one sending the signal, after which it just needs to wait for the first message on the semaphore, coming from the signal handler.
So I think you could communicate to a "receiver" that "a signal has been handled" from within the signal handler, but you'd have to do everything with atomics. You'd also have to account for the fact that a previous "message" might not have been handled yet by the time the next signal arrives.