
completed task shows warning for having lost its waker

cedricje opened this issue 2 years ago • 8 comments

What crate(s) in this repo are involved in the problem?

tokio-console

What is the issue?

tokio-console shows a warning for a task having lost its waker, even though the task should be finished. I think it's related to the task spawning another task.

How can the bug be reproduced?

The problem can be reproduced with the following sample application. Connect to 127.0.0.1:5555 and it will spawn two tasks. The first one should be completed after 1 second, but tokio-console will show it as having lost its waker.

use tokio::net::TcpListener;

#[tokio::main]
async fn main() {
    console_subscriber::init();

    if let Ok(listener) = TcpListener::bind("127.0.0.1:5555").await {
        loop {
            if let Ok((_, _)) = listener.accept().await {
                // First task: completes after 1 second, but spawns a
                // long-lived child task just before finishing.
                let handle = tokio::task::spawn(async move {
                    tokio::time::sleep(tokio::time::Duration::from_secs(1)).await;
                    // Second task: keeps running for 10 minutes.
                    tokio::task::spawn(async move {
                        tokio::time::sleep(tokio::time::Duration::from_secs(600)).await;
                    });
                });

                // The first task is awaited to completion here, yet
                // tokio-console still flags it as having lost its waker.
                let _ = handle.await;
                println!("done");
            }
        }
    }
}

Logs, error output, etc

(screenshot: tokio-console showing the lost-waker warning for the completed task)

Versions

console crates versions:

tokio-console-warning v0.1.0 (/home/cedric/src/rust-test/tokio-console-warning)
├── console-subscriber v0.1.5
│   ├── console-api v0.2.0

tokio-console is built from git, version:

git describe
tokio-console-v0.1.4-5-g472ff52

Possible solution

No response

Additional context

No response

Would you like to work on fixing this bug?

maybe

cedricje avatar May 02 '22 13:05 cedricje

I think the issue is that the task's span is (incorrectly) being kept alive by the child task's span. We'll probably want to change the tokio instrumentation so that task spans have parent: None and instead follow from the current span, so that they don't keep the current task's span alive incorrectly.
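A minimal sketch of that idea, assuming tracing's span! macro with an explicit parent: argument and Span::follows_from; this is illustrative only, not the actual Tokio instrumentation:

use tracing::{span, Level, Span};

// Illustrative sketch, not the real instrumentation: create a task span
// detached from the current span tree (`parent: None`) while still recording
// causality via `follows_from`, so a spawned child no longer keeps the
// parent task's span alive.
fn detached_task_span() -> Span {
    let task_span = span!(parent: None, Level::TRACE, "runtime.spawn");
    task_span.follows_from(Span::current().id());
    task_span
}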

hawkw avatar May 04 '22 22:05 hawkw

I think I've encountered this issue. I also want to highlight that the task that has completed (I've .awaited the JoinHandle) and that "has lost its waker" according to tokio-console warning is still being polled (or at least tokio-console says so).

Screenshots

(two screenshots dated 2022-07-04 showing the completed task flagged as having lost its waker while still being polled)

I also don't think that this issue requires using spawn inside of spawn, since for me it occurs with just a single spawn deep inside a #[tokio::main]-marked function. Though there is a lot of code, so I'm not sure if I'll be able to make a good MRE.

WaffleLapkin avatar Jul 04 '22 10:07 WaffleLapkin

Another option that could be worth considering is giving the task spans some way of explicitly indicating that the task has terminated (e.g. it could record a field value like task.terminated = true). This way, we would know when a task has completed, even when the task's span is kept alive because it has children. Unlike the solution of changing task spans to set parent: None, this would also resolve cases where the task span is kept alive by a child span that is not a task...which is what I suspect is going on in @WaffleLapkin's case:

I also don't think that this issue requires using spawn inside of spawn since for me this occurs with just a single spawn deep inside of #[tokio::main] marked function. Though there are a lot of code, so I'm not sure if I'll be able to make a good MRE.
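A minimal sketch of what recording such a marker could look like, assuming tracing's field::Empty placeholder for fields recorded after span creation; the field name task.terminated is the hypothetical one suggested above:

use tracing::{field::Empty, span, Level, Span};

// Hypothetical sketch: declare the field up front with `Empty`, because
// tracing only allows recording fields that were declared when the span
// was created.
fn task_span() -> Span {
    span!(Level::TRACE, "runtime.spawn", task.terminated = Empty)
}

// Called when the task's future completes; the console could then treat the
// task as done even while child spans keep the span itself alive.
fn mark_terminated(task_span: &Span) {
    task_span.record("task.terminated", true);
}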

hawkw avatar Jul 06 '22 18:07 hawkw

I'm sorry for the off-topic discussion, but searching for this warning brought me here. I'm new to async in Rust and saw this warning while debugging an issue. Can anyone point me towards a resource for why this warning occurs (i.e., what pattern in async Rust leads to such a scenario and how to fix it)? I'm trying to determine if it's a console bug as indicated in this issue, or a real bug in my own code.
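For context, the scenario this warning is meant to flag is a future that returns Poll::Pending without storing the waker from its Context, so nothing can ever wake the task again. A minimal illustrative sketch of that buggy pattern:

use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};

// Buggy future: it returns `Pending` but ignores the waker in `_cx`, so no
// one ever calls `wake()` and the task is never polled again.
struct NeverWoken;

impl Future for NeverWoken {
    type Output = ();
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        // Correct code would clone and store `_cx.waker()` somewhere that a
        // wake-up source can reach; omitting that is the lost-waker bug.
        Poll::Pending
    }
}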

landaire avatar Jul 09 '22 17:07 landaire

I'm seeing this when using actix-web. tokio-console reports lost wakers in:

~/.cargo/registry/src/github.com-1ecc6299db9ec823/actix-rt-2.7.0/src/runtime.rs:80:20
~/.cargo/registry/src/github.com-1ecc6299db9ec823/actix-rt-2.7.0/src/lib.rs:207:5

which look like regular uses of spawn_local() and local_set.block_on().
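For reference, a rough sketch of that kind of regular usage (ordinary tokio LocalSet code, not the actual actix-rt source; the sleep duration is made up):

use tokio::task::LocalSet;

fn main() {
    let rt = tokio::runtime::Builder::new_current_thread()
        .enable_all()
        .build()
        .unwrap();
    let local = LocalSet::new();
    // Plain `LocalSet::block_on` driving a `spawn_local` task: the pattern
    // the paths above point at.
    local.block_on(&rt, async {
        tokio::task::spawn_local(async {
            tokio::time::sleep(std::time::Duration::from_millis(10)).await;
        })
        .await
        .unwrap();
    });
}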

I also get one in ~/.cargo/registry/src/github.com-1ecc6299db9ec823/hyper-0.14.23/src/common/exec.rs:49:21

and another reported leak is in tokio itself:

~/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.23.0/src/task/local.rs:586:35

Is this something I should be worried about? My server leaks about 20 wakers almost instantly, and it usually goes up to 50 before it becomes quite unresponsive. I don't know if that's the cause, or a symptom of some other issue, or just something unrelated.

kornelski avatar Dec 22 '22 02:12 kornelski

I'm seeing this when holding on to a WeakSender whose associated Sender was created in the exited task. Here's a minimal reproduction:

use std::sync::{Arc, Mutex};
use tokio::{sync::mpsc, task::JoinSet};

async fn handler(holder: Arc<Mutex<Vec<mpsc::WeakSender<()>>>>) {
  let (handler_tx, _) = mpsc::channel::<()>(1);
  let mut holder = holder.lock().unwrap();
  holder.push(handler_tx.downgrade()); // Push `WeakSender` to Vec.
  drop(holder);
  panic!(); // Not required to reproduce.
}

#[tokio::main(flavor = "current_thread")]
async fn main() {
  console_subscriber::init();
  let mut set = JoinSet::new();
  let holder = Arc::new(Mutex::new(Vec::new()));
  for _ in 0..16 {
    set.spawn(handler(holder.clone()));
  }
  // Properly await all exited tasks.
  while let Some(_) = set.join_next().await {}
  std::future::pending::<()>().await;
}

When run, tokio-console shows "16 tasks have lost their waker", and for each task shows "This task has lost its waker, and will never be woken again".

I would expect this to leak the 16 WeakSender handles, but given that all tasks were properly awaited, I would not expect any tasks to leak.

traviscross avatar Dec 27 '22 07:12 traviscross

This issue is now fixed in the Tokio instrumentation on master (tokio-rs/tokio#6158). Let's wait until the next Tokio release before we close the issue.

hds avatar Nov 19 '23 18:11 hds

I think this can be closed now; the new Tokio version includes the fix.

NULLx76 avatar Dec 11 '23 14:12 NULLx76