async-tungstenite icon indicating copy to clipboard operation
async-tungstenite copied to clipboard

Task wakes too often in release mode when used with `async_std`

Open afeistel opened this issue 1 year ago • 14 comments

This may cause a CPU core to be constantly at 100% after a client connects and sends nothing:

#[async_std::main]
async fn main() -> async_tungstenite::tungstenite::Result<()> {
    let listener = async_std::net::TcpListener::bind("127.0.0.1:8080").await?;
    let (stream, _) = listener.accept().await?;
    async_tungstenite::accept_async(stream).await?;
    Ok(())
}

(the same also happens after the handshake when the client is idle; using only accept_async as the example, because shorter)

It could be an issue with async-std but I haven't yet been able to replicate it without async-tungstenite ~~(both codebases are still a bit hard for me to navigate)~~.

[dependencies]
async-std = { version = "1.12.0", features = ["attributes"] }
async-tungstenite = { version = "0.26.1", features = ["async-std-runtime"] }

afeistel avatar Jun 13 '24 15:06 afeistel

If you do the same with tokio, does it behave correctly?

sdroege avatar Jun 14 '24 12:06 sdroege

I've tested it with async-net (versions 1 and 2), and that worked okay; will test with tokio now;

ed.: tokio is fine

afeistel avatar Jun 14 '24 13:06 afeistel

This, to me, looks a lot like some sort of soft race condition, because it goes away if I add some overhead (or run the thing in debug) or do anything that causes poll be called not as soon as it wakes, and happens more often if I add this at the very start of main:

async_std::task::spawn(futures_util::future::pending::<()>());

afeistel avatar Jun 14 '24 13:06 afeistel

Update: this seems to affect anything that uses async_io::Async::<std::net::TcpStream>::poll_read directly. Happens both on versions 1 and 2

afeistel avatar Jun 14 '24 14:06 afeistel

I found the reason (async_tungstenite::compat):

let waker = match kind {
    ContextWaker::Read => task::waker_ref(&self.read_waker_proxy),
    ContextWaker::Write => task::waker_ref(&self.write_waker_proxy),
};

and the "fix" (for now I can't figure out much better):

let waker = match kind {
    ContextWaker::Read => task::waker_ref(&self.read_waker_proxy),
    ContextWaker::Write => task::waker_ref(&self.write_waker_proxy),
}
.clone();

why:

  • Waker returned by task::waker_ref changes its vtable on clone
  • async-io's Source, just like AtomicWaker, checks Waker equality by Waker::will_wake which compares both the pointer and the vtable
  • async-io's Source wakes the older Waker on change

as for why async-net works: I have no clue

afeistel avatar Jun 14 '24 18:06 afeistel

Okay, so I've added a print:

let waker = match kind {
    ContextWaker::Read => task::waker_ref(&self.read_waker_proxy),
    ContextWaker::Write => task::waker_ref(&self.write_waker_proxy),
};
eprintln!("{waker:?} {:?}", waker.clone());

In debug builds (some parts omitted):

... data: 0x563a8f431fa0, vtable: 0x563a8e701368 ... data: 0x563a8f431fa0, vtable: 0x563a8e701368 ...

In release builds on 1.78/1.79:

... data: 0x563ac7b8cf30, vtable: 0x563ac5b80238 ... data: 0x563ac7b8cf30, vtable: 0x563ac5b80178 ...

???????

compiler bug?

Seems like it's making the same vtable be at different addresses.

ed.: it was indeed a compiler bug: https://github.com/rust-lang/futures-rs/pull/2829#issuecomment-1962863389

afeistel avatar Jun 14 '24 18:06 afeistel

There's been a change in how Waker works. std::task::Wake has adapted to it and futures::task::ArcWake has not. I've checked with both std::task::Wake and with a patched version of futures::task::ArcWake, both work fine. So, ig, if there's a need to mitigate this without waiting for futures-task to be changed, switching over std::task::Wake is a possibility at the cost of some extra atomic operations (std::task doesn't provide an equivalent to waker_ref, as far as I see).

afeistel avatar Jun 14 '24 22:06 afeistel

I don't have time to look into this in more detail until some time next week, but this sounds like a bug in the code doing the comparison. If multiple codegen units are involved, it's not guaranteed that the vtable is always the same for the same object.

That's why e.g. https://doc.rust-lang.org/stable/std/sync/struct.Arc.html#method.ptr_eq exists.

sdroege avatar Jun 15 '24 16:06 sdroege

I'll list all the parts of code I found relevant so far in one place, ~~and with less guesses~~

async-tungstenite creating wakers that change their vtable on clone: https://github.com/sdroege/async-tungstenite/blob/6ff28b380c55b59a6ca96a6d76528754fa7b50ff/src/compat.rs#L128-L129

async-std directly uses async_io::Async: https://github.com/async-rs/async-std/blob/8947d8e03c4c267fdd3828e07368cefc7d39b002/src/net/tcp/stream.rs#L5 https://github.com/async-rs/async-std/blob/8947d8e03c4c267fdd3828e07368cefc7d39b002/src/net/tcp/stream.rs#L50 https://github.com/async-rs/async-std/blob/8947d8e03c4c267fdd3828e07368cefc7d39b002/src/net/tcp/stream.rs#L308

async-io checking Wakers for equality and waking the old one on change: https://github.com/smol-rs/async-io/blob/master/src/reactor.rs#L456-L461

futures-task creates vtables for original and cloned (note no #[inline(always)] on clone_arc_raw): https://github.com/rust-lang/futures-rs/blob/master/futures-task/src/waker_ref.rs#L64 https://github.com/rust-lang/futures-rs/blob/master/futures-task/src/waker.rs#L38-L42

Patch to alloc::task in Rust that uses #[inline(always)] to avoid the split across CGUs: https://github.com/rust-lang/rust/pull/121622

async-tungstenite can mitigate this by using std::task::Wake instead of futures_task::ArcWake or cloning the waker, both options involve extra Arc clones to work (&Arc<impl std::task::Wake> can't be directly converted into impl Deref<Target = Waker> like futures_task::waker_ref does).

futures-task can mitigate this by putting #[inline(always)] on clone_arc_raw.

afeistel avatar Jun 15 '24 17:06 afeistel

futures-task can mitigate this by putting #[inline(always)] on clone_arc_raw.

That seems like a solution that will just fail again some time in the future. This sounds to me like will_wake() should actually only check the pointer for equality and not the vtable.

async-tungstenite can mitigate this by using std::task::Wake instead of futures_task::ArcWake or cloning the waker

Doing this as a workaround for now seems acceptable. Do you want to provide a PR for that?

sdroege avatar Jun 17 '24 05:06 sdroege

Intentionally keeping this open here so there's a reminder to revert the change once that's safe to do again.

sdroege avatar Jun 17 '24 07:06 sdroege

https://github.com/rust-lang/futures-rs/commit/b92f4c55a3f8e9ffd2201b7a2c1827343339c92e

afeistel avatar Sep 16 '24 16:09 afeistel

Tested with futures 0.3.31 and async-tungstenite =0.26.1, the problem does not seem to appear, unlike with futures =0.3.30 and async-tungstenite =0.26.1.

afeistel avatar Oct 06 '24 21:10 afeistel

Thanks, I'll update the minimum version for futures with the next release and drop the workaround

sdroege avatar Oct 14 '24 10:10 sdroege