lilos Could lilos_handoff::Push take &mut Option<T> to be cancel safe?

This way the pointer could be directly provided so the T isn't taken out unless the future succeeds.

Jul 29 '23 17:07 mattfbacon

I believe while this fixes the issue with dropping the future returned by Push::push, API consumers cannot avoid being weakly cancel-safe themselves. To use such an API, the API consumer would need to have space for the T in its own future (or call stack). As soon as this future is cancelled, the T is dropped instead of being processed.

As I understand the problem, we want to perform two operations: generating some value and sending this value to a different task. With the current API is that we always need to generate the value before sending it. Ideally, we would only generate the value when we can be sure to succeed in sending it. lilos::spsc::Permit solves this neatly, as we can obtain the permit before generating the value. Once we obtain the permit, we are assured that there is space in the queue and sending the value will succeed without having to await. Therefore, we can generate and send the value in a single invocation of poll.

Sep 01 '23 08:09 korrat

I have to think about that. I wonder it it could actually be done soundly.. if the pop end is waiting, we get a permit. Then the pop future is cancelled and forgotten (no Drop), but the permit would still seem valid. I think this one-step approach is the only valid option.

That said, maybe an API that takes make_value: impl FnOnce() -> T would work?

Sep 01 '23 17:09 mattfbacon

Okay so I clearly need to fix my Github notification settings. Hello from like seven months later!

On @korrat's point - I tried rewriting the handoff API to take a closure, but it turns out to be incredibly subtle -- since the closure needs to be called from the taker's context it wasn't clear that it could be done. Or at least, I couldn't figure out how to do it. I'm definitely open to insights.

@mattfbacon - I think your push_from operation is clever. I need to do some testing in my codebases to see what the generated code looks like. I feel like it addresses 90% of my cancel correctness concerns with handoff -- while @korrat is correct that it doesn't save the caller from doing a potentially destructive or expensive operation to produce the T, it does at least let the caller "put it back" on cancellation of the pusher future.

The fact that some_handoff.push_from(&mut None) compiles is a bit unfortunate. Your decision to treat it as a no-op in #11 is one of several ways to address that, and has the advantage of generating very little code compared to a panic. I think your approach may be reasonable.

Mar 02 '24 21:03 cbiffle

After thinking about @korrat's analysis in #14, I'm less confident about push_from, because it isn't sufficient to make the transfer truly cancel-correct. The sequence of events in question:

Task A parks at pop waiting for rendezvous.
Task B executes push_from, which immediately completes because A was waiting. B has lost access to the item it's sending.
Before resolving the pop future, Task A cancels it. This could be a result of e.g. popping from the handoff within a select_biased! that chooses an earlier branch and bails. The item is now lost with no indication.

There are various small changes I could imagine to address one piece or the other -- for instance, we could maintain a "data lost" flag so that there's at least some indication that this happened, though I don't see how I would use that in practice to fix an application.

I am currently leaning toward concluding that handoff, while super useful, is inherently cancel-unsafe, and can only be used in cases where data loss is tolerable (rare) or where both sides are careful not to cancel the push/pop futures (more common).

I've also tested the associated PR on some of my firmware, and it increases code size by a small but measurable amount. I'm investigating the causes. I suspect that the compiler is unable to entirely eliminate the operations on the Option, which is odd since it appears to be totally inlining push_from.

Given (1) the realization that a minimum-copy handoff may be inherently cancel-unsafe, and (2) the increase in lines-of-code and bytes-in-flash, I'm now leaning toward not merging this PR in its current form.

I'm definitely open to alternatives, or to being convinced otherwise.

Mar 02 '24 23:03 cbiffle

Ah, good point, it's cancel-safe in the case where the push_from happens before the pop, but not vice versa, because of the PopWait variant. Maybe I'm missing something, but why does the popper need to provide the location in this variant? Why couldn't the popper wait until the PushWait condition occurs, then take it from that?

Mar 04 '24 02:03 mattfbacon

lilos lilos copied to clipboard

Could lilos_handoff::Push take &mut Option<T> to be cancel safe?

lilos
lilos copied to clipboard