ipc icon indicating copy to clipboard operation
ipc copied to clipboard

ipc_*: The (very few) blocking APIs' behavior on signal = inconsistent, some uninterruptible.

Open ygoldfeld opened this issue 1 year ago • 0 comments

In most of Flow-API, APIs are non-blocking. There are various key async APIs, but that's not blocking. There are however a handful of potentially-blocking APIs. Ideally these should behave in some reasonable way on signal, so that at least it would be possible to interrupt a program on SIGTERM while it is blocking inside such an API. In POSIX generally blocking I/O functions will emit EINTR in this situation. We should do something analogous; or failing that at least something predictable and consistent; or failing that present mitigations (but in that case not resolve this ticket).

Here are the existing blocking APIs as of this writing.

  • Concept: Persistent_mq_handle::[timed_]send(), [timed_]receive(), [timed_]wait_sendable(), [timed_]wait_receivable(). Impls:
    • Posix_mq_handle::*()
      • Internally this epoll_wait()s. Signal => EINTR; so we will also emit Error_code around EINTR.
      • Assessment: Pretty good, in that it is interruptible with minimum fuss.
    • Bipc_mq_handle::*()
      • Internally this (via some Boost code, long story) pthread_cond_[timed]wait()s. How this will act on signal is unclear; as of this writing I have not checked or do not remember also what the Boost code does in wrapping the pthread_ call. But the call itself is documented to yield no EINTR ever in some man pages, and EINTR only from pthread_cond_timedwait() (not the one without timed_ though) in some other man pages.
      • Assessment: Unclear, but not good. It will almost certainly not work in some cases and will just keep waiting.
  • struc::Channel::sync_request() (+ future likely similar API(s) including sync_expect_msg())
    • Internally this uses promise::get_future().wait[_for]() which almost certainly internally in turn uses pthread_cond_[timed]wait().
    • Assessment: Probably similar to Bipc_mq_handle situation but would need to be verified.
  • Nota bene: sync_connect() (in socket-stream and client-session) are today formally non-blocking, because it's non-networked IPC; but there are plans to make it networked-IPC and those guys in that case will become potentially-blocking. At that point we could leverage built-in EINTR probably; details TBD. Just keep it in mind.

Ideally all three -- and any future blocking impls -- should get interrupted if-and-only-if a built-in blocking call such as ::read() or ::poll() would be interrupted, on signal. How they're interrupted is less important: whether it's EINTR or another Error_code, as long as it's documented (that said perhaps it's best to standardize around an error::Code of ours we'd add for this purpose).

I'll forego explaining how to impl this. The Posix_mq_handle guy is already there; but the other two are a different story. (For Bipc_mq_handle one could somehow leverage the already-existing Persistent_mq_handle::interrupt_*() APIs. But, then another thread would be necessary. Food for thought at any rate.) (For Channel, not sure, but we should be able to swing something; maybe an intermediary thread; signal => satisfy the promise.)

Lastly, but very much not leastly (sorry), users should be aware of the following mitigating facts, until this ticket is resolved:

  • Persistent_mq_handle:
    • This concept is not really primarily meant for direct use. It can be -- it is nicer than using the POSIX MQ or bipc::message_queue APIs directly, and it adds features including interruptibility to the latter -- but that's a bonus really; these exist to implement Blob_stream_mq_sender/receiver higher-level APIs, and those work just fine (there is no signal-interrupt issue): they (internally) avoid blocking calls except .wait_able(), and the latter is .interrupt_()ed if needed (namely from dtor).
    • If you do use it:
      • interrupt_*() is available to you. So if it is necessary to interrupt a wait or blocking-send/receive, then you can use a separate thread to run it in and invoke interrupt_*() from a signal handler. This isn't no-fuss, but it does work if needed at least.
      • With Posix_mq_handle, you do get the nice normal EINTR behavior already.
      • timed_*() are also available: You could use those with a short, but not-too-short, timeout -- in which case a signal can affect them but only after a bit of lag; could be fine. This is the usual approach... not perfect but doable and pretty simple.
  • struc::Channel::sync_request():
    • This guy is only blocking if locally there is no expect_msg[s]() pending on the opposing side (for future speculated sync_expect_msg() API: if there is no already-received-and-queued-inside in-message of the proper type, or one isn't forthcoming immediately).
    • But in many -- most -- cases protocols are set up in such a way as to in fact make sync_request() non-blocking: It is expected the opposing side has set-up their expect_msg[s]() by the time we issue the sync_request(). That's just good for responsive IPC; and it's nice not to need an async call like async_request(), where a non-blocking one would do much more simply.
      • For example we use sync_request() internally in other parts of Flow-IPC this way routinely.

ygoldfeld avatar Mar 12 '24 03:03 ygoldfeld