tokio
tokio copied to clipboard
Tasks should drop the waker when they complete
Currently the JoinHandle waker is stored in the task struct until every ref-count to the task is dropped. Therefore, you can have a ref-cycle if two tasks await each other's JoinHandle
. For example, this leaks:
use tokio::task::JoinHandle;
fn main() {
let rt = tokio::runtime::Builder::new_current_thread().build().unwrap();
let (send, recv) = tokio::sync::oneshot::channel::<JoinHandle<()>>();
let jh = rt.spawn(async move {
let jh2 = recv.await.unwrap();
jh2.await.unwrap();
});
let jh2 = rt.spawn(async move {
jh.await.unwrap();
});
send.send(jh2).unwrap();
rt.block_on(async {
for _ in 0..10 {
tokio::task::yield_now().await;
}
});
}
Here's a proposal for fixing this:
The JoinHandle waker
There are two bits that control access to the JoinHandle waker field: JOIN_WAKER and JOIN_INTEREST. The rules are as follows:
If JOIN_INTEREST is one, then the JoinHandle may read from the waker field.
If the JoinHandle set the JOIN_WAKER field to zero, then the JoinHandle may mutate the waker field.
If the runtime set the JOIN_WAKER field to zero and JOIN_INTEREST is one, then the runtime may read from the waker field.
If the runtime set the JOIN_WAKER field to zero and JOIN_INTEREST is zero, then the runtime may mutate the waker field.
The JOIN_WAKER field starts out as a zero. The initial zero counts as having been set by the JoinHandle.
The runtime may only set the JOIN_WAKER field to zero when COMPLETE is one. This ensures that if the JoinHandle ever sees both JOIN_WAKER and COMPLETE being zero, then the JoinHandle knows that it may mutate the waker field.
To show that the above never results in any race conditions, we consider the two cases where the waker field may be modified:
- If the JoinHandle mutates the waker, then the JoinHandle has set JOIN_WAKER to zero. The runtime never accesses the field unless the runtime is the one who set it to zero, so this ensures that the runtime will not race on the field.
- If the runtime mutates the waker, then the JOIN_INTEREST field is zero and the runtime is the one who set JOIN_WAKER to zero. However, the JoinHandle can only access the field if JOIN_INTEREST is one, or if the JoinHandle set the JOIN_WAKER field to zero. Therefore, the JoinHandle cannot race on the waker field.
Normally, the waker is destroyed by the JoinHandle when the JoinHandle is destroyed. The JoinHandle does this by setting JOIN_WAKER and JOIN_INTEREST to zero, and then destroying the waker. However, if JOIN_WAKER is already zero when this happens, then the JoinHandle does not perform any cleanup of the waker. This situation only happens if no waker is set, or if the runtime is concurrently waking the waker.
When the runtime wakes the JoinHandle, it does this by first setting JOIN_WAKER to zero to obtain access to the waker field. If JOIN_WAKER was already zero, then nothing happens. Once the runtime has woken the waker, it tries to set the JOIN_WAKER bit back to true, but in a way that fails if JOIN_INTEREST is zero. In that case, the runtime now knows it can mutate the waker and the runtime will destroy it.
@Darksonn I would be interested in having a go at this issue.
I'm slowly getting my head around what currently exists and how that relates to the proposed solution.
My understanding is that the new part of this proposal is only the last sentence. That when the runtime wakes the waker and tries to set the JOIN_WAKER
flag back to 1
, but finds that JOIN_INTEREST
is 0
, it will destroy the waker on the task. And that this would all happen somewhere from RawTask::try_read_output
down to can_read_output
in the harness
module. It looks to me like everything else described here is already implemented.
Am I a long way off?
I already have a 75% complete implementation of this.
Oh cool. I'd be interested in seeing it when it's done. I feel I might understand some of it.