wg-async
when cancellation goes wrong
- Brief summary: I've heard a lot of people discuss hazards related to task cancellation -- I think this is because any await point is effectively a cancellation point, and people don't always write "cancellation-safe" code. I'd love to hear more concrete examples of where this happens.
- Character: not sure!
- Key points or morals (if known): not sure!
Part of this blog post feels (maybe) related:
https://kevinhoffman.medium.com/rust-async-and-the-terrible-horrible-no-good-very-bad-day-348ebc836274
Lesson III — If you drop (or allow to be dropped) the tokio runtime, any pending tasks are cancelled.
AsyncDrop would likely help with a lot of the pain of unexpected/external drops, I think. It's really awkward to work around the current inability to do async stuff in Drop
For intentionally canceling futures and streams, it would be really nice if something like https://crates.io/crates/stop-token were standardized
@jbr I think it would help to figure that out if we had as many concrete examples as we can get. One thing I'd like to drill into a bit is the connection between cancellation and AsyncDrop, because we may not always be able to do async drop during cancellation.
Another relevant story, https://tomaka.medium.com/a-look-back-at-asynchronous-rust-d54d63934a1c:
async fn read_send(file: &mut File, channel: &mut Sender<...>) {
    loop {
        let data = read_next(file).await;
        let items = parse(&data);
        for item in items {
            channel.send(item).await;
        }
    }
}
Each await point in asynchronous code represents a moment when execution might be interrupted, and control given back to the user of this future. This user can, at their will, decide to drop the future at this point, stopping its execution altogether.
If the user calls the read_send function, then polls it until it reaches the second await point (sending on a channel), then destroys the read_send future, all the local variables (data, items, and item) are silently dropped. By doing so, the user will have extracted data from file, but without sending this data on channel. This data is simply lost.
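To make the lost-data scenario above concrete, here is a minimal std-only sketch. It is not the code from the blog post: `YieldOnce`, `NoopWaker`, `drop_mid_send`, and the use of a `VecDeque` as the "file" and a `Vec` as the "channel" are all assumptions made for illustration. The future is polled once by hand, suspends at the "send", and is then dropped.

```rust
use std::cell::RefCell;
use std::collections::VecDeque;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical helper: returns `Pending` on the first poll and `Ready` on
/// the second. It stands in for a send that cannot complete immediately.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            Poll::Pending
        }
    }
}

/// A waker that does nothing; enough for polling by hand.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Simplified stand-in for `read_send`: take an item out of the source,
/// wait, then hand it to the sink.
async fn read_send(file: &RefCell<VecDeque<u32>>, channel: &RefCell<Vec<u32>>) {
    loop {
        let next = file.borrow_mut().pop_front();
        let Some(item) = next else { return };
        // Await point: the caller may drop the future while it is suspended here.
        YieldOnce(false).await;
        channel.borrow_mut().push(item);
    }
}

/// Polls `read_send` once, drops it, and reports what is left in the source
/// and what reached the sink.
fn drop_mid_send() -> (Vec<u32>, Vec<u32>) {
    let file = RefCell::new(VecDeque::from([1, 2, 3]));
    let channel = RefCell::new(Vec::new());
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);

    let mut fut = Box::pin(read_send(&file, &channel));
    // The first poll reads item 1 and then suspends at the await point.
    let _ = fut.as_mut().poll(&mut cx);
    // Dropping the future here silently drops the local `item` as well.
    drop(fut);

    let remaining: Vec<u32> = file.borrow().iter().copied().collect();
    let sent = channel.borrow().to_vec();
    (remaining, sent)
}
```

Calling `drop_mid_send()` leaves 2 and 3 in the source and nothing in the sink: item 1 was read but never sent.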
I've written https://gist.github.com/Matthias247/ffc0f189742abf6aa41a226fe07398a8 2 years ago :-)
For intentionally canceling futures and streams, it would be really nice if something like https://crates.io/crates/stop-token were standardized
Agreed. I wanted to look into this, and think we could have a StopToken similar to the C++ one which is usable both synchronously and asynchronously. But I lacked the time to look into it further, and am currently afraid we won't get it standardized since every async framework wants to have its own version (as has happened with channels, mutexes, etc).
Hmm, I would really like to see these stories written up ...
There is also an old issue on futures about cancellation especially in the context of completion-based APIs: https://github.com/rust-lang/futures-rs/issues/1278 . That probably also contains some useful information.
After a great writing session about this issue, we unfortunately did not end up with a finished story, but we've come up with several observations which we think will lead to at least two status quo stories.
Issues around cancellation
Accidental cancellation
As @seanmonstar pointed out above, sometimes futures are dropped before they are polled to completion. If code is not written explicitly with this in mind, surprising results can occur. Because each await point gives control back to the caller, the author of such code must keep in mind that code after an await point might not run. For example,
async fn push_event(&self) {
    let event = self.events.recv().await; // This await point may give control back to the caller
    self.events.push(event); // If the future returned from `push_event` is dropped while suspended above, this line never runs
}
Premature dropping of futures is often illustrated with the select! macro (e.g., the example from @tomaka above), but this is not the only case where it can happen. Futures must be written so that dropping them at any await point, before they have been polled to completion, is safe.
This is documented in several places, but may be hard to understand at first.
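One possible way to write such code, sketched here under assumptions rather than taken from any of the linked posts, is to avoid holding "in flight" data across an await point: look at the next item without removing it, wait, and only then commit both sides in one synchronous step. `YieldOnce`, `NoopWaker`, `forward_all`, and `drop_then_retry` are hypothetical names, and the `VecDeque` and `Vec` stand in for a real source and sink.

```rust
use std::cell::RefCell;
use std::collections::VecDeque;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical helper: `Pending` on the first poll, `Ready` on the second.
/// It stands in for waiting until the sink has capacity.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            Poll::Pending
        }
    }
}

/// A waker that does nothing; enough for polling by hand.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Forwards items so that dropping the future at the await point loses nothing.
async fn forward_all(file: &RefCell<VecDeque<u32>>, channel: &RefCell<Vec<u32>>) {
    loop {
        // Peek at the next item; it stays in the source for now.
        let next = file.borrow().front().copied();
        let Some(item) = next else { return };
        // Await point: if the future is dropped here, the item is still in `file`.
        YieldOnce(false).await;
        // Commit both sides together, with no await point in between.
        channel.borrow_mut().push(item);
        file.borrow_mut().pop_front();
    }
}

/// Drops a first attempt mid-way, then runs a second attempt to completion.
fn drop_then_retry() -> (Vec<u32>, Vec<u32>) {
    let file = RefCell::new(VecDeque::from([1, 2, 3]));
    let channel = RefCell::new(Vec::new());
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);

    let mut first = Box::pin(forward_all(&file, &channel));
    let _ = first.as_mut().poll(&mut cx); // suspends at the await point
    drop(first); // "accidental" cancellation
    let after_drop: Vec<u32> = file.borrow().iter().copied().collect();

    let mut second = Box::pin(forward_all(&file, &channel));
    // Busy-poll until done; acceptable only because this is a toy example.
    while second.as_mut().poll(&mut cx).is_pending() {}
    drop(second);

    let sent = channel.borrow().to_vec();
    (after_drop, sent)
}
```

Here the source still holds all three items after the first future is dropped, and a later attempt delivers all of them. Whether this shape is available in practice depends on the source and sink offering a peek or reserve style operation, which not every API does.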
No standard way to perform cancellation of futures
Cancellation of tasks can be achieved in several ways:
- Dropping futures: as discussed above, dropping a future means it is no longer driven forward to completion. Some issues with this:
  - Dropping has no impact on operations associated with that future (e.g., I/O operations the future has started), which remain blissfully unaware that the user wishes to cancel them.
  - A particular piece of code may wish to cancel a future without having ownership over it.
- Explicit state: A future can keep track of its state as being either canceled or not and have some mechanism for flipping that state. For example, on creating the future, the user receives some sort of mechanism (usually a channel) for signaling to the future that it should flip its state from not-canceled to canceled. Issues with this:
  - There is no standard way of doing this, so each future is left to implement it on its own in a bespoke way. This can get complicated when trying to signal to associated operations that they should be canceled. There are, however, some crates which try to offer generalized approaches, like stop-token.
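As a rough illustration of the "explicit state" approach, here is a std-only sketch. It is one bespoke variant among many and is not the stop-token API: `CancelHandle`, `Cancellable`, `NoopWaker`, and `cancel_demo` are hypothetical names, and a shared atomic flag is used instead of a channel to keep the example short. Note one simplification: the sketch does not wake the task when the flag flips, so cancellation is only observed on the next poll.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical handle that lets code without ownership of the future
/// request cancellation.
#[derive(Clone, Default)]
struct CancelHandle(Arc<AtomicBool>);

impl CancelHandle {
    fn cancel(&self) {
        self.0.store(true, Ordering::SeqCst);
    }
    fn is_cancelled(&self) -> bool {
        self.0.load(Ordering::SeqCst)
    }
}

/// Wraps a future and resolves to `None` once cancellation was requested.
struct Cancellable<F> {
    inner: Pin<Box<F>>,
    handle: CancelHandle,
}

impl<F: Future> Future for Cancellable<F> {
    type Output = Option<F::Output>;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        // Check the flag first so a cancelled future stops making progress.
        if self.handle.is_cancelled() {
            return Poll::Ready(None);
        }
        self.inner.as_mut().poll(cx).map(Some)
    }
}

/// A waker that does nothing; enough for polling by hand.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Polls a never-finishing future, cancels it through the handle, polls again.
fn cancel_demo() -> (bool, Poll<Option<()>>) {
    let handle = CancelHandle::default();
    let mut fut = Cancellable {
        inner: Box::pin(std::future::pending::<()>()),
        handle: handle.clone(),
    };
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);

    let first_was_pending = Pin::new(&mut fut).poll(&mut cx).is_pending();
    // The handle can be moved elsewhere; the caller does not need the future itself.
    handle.cancel();
    let second = Pin::new(&mut fut).poll(&mut cx);
    (first_was_pending, second)
}
```

The wrapper addresses the ownership problem from the first bullet, but it still says nothing to any I/O the inner future has started; forwarding the signal to those operations is the part each implementation ends up solving on its own.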
Cancellation must often be explicit
Even if a user wants to use Drop as a way to cancel an operation, this is often not possible. Some operations need to perform async cleanup before being canceled. For instance, tokio's channel implementation offers an explicit close method that first stops the channel from receiving further elements, after which the user can drain any elements remaining in the queue. Because Drop is synchronous, there is no way to do this kind of cleanup from Drop.
Documentation on how to handle cancellation is sparse
The async book currently has an empty entry for this. Because cancellation is often modeled in a bespoke manner, it's hard to learn any single way to handle it.
The plan
With this in mind, we had two ideas for stories:
- A user writes code where a future accidentally gets dropped early, so that some code the user expected to run never does. This might be best for Alan, as understanding this issue requires understanding how futures are actually run.
- A user needs to model cancellation but they're unsure how to do it. We've not been able to come up with a good example of this that hits most of the points above. If you've run into this issue, please do reach out.
Thanks to @doc-jones, @emmanuelantony2000, @zeenix, and @sdroege for a great session!
A related story was recently posted. https://gendignoux.com/blog/2021/04/08/rust-async-streams-futures-part2.html
The author @gendx might be able to share some more concrete examples.
We had a session today that addressed “accidental cancellation”. There should be a PR coming soon.
Hey 👋 Just opened a PR, I'd be happy to polish it 😊
Just opened a PR, I'd be happy to polish it 😊
Can you please link to the PR? Thanks.
Hey, I think it was #153, all good!