js-libp2p
js-libp2p copied to clipboard
Yamux pontentially sending a WindowUpdate after it has closed the stream
-
Version: js-v0_45
-
Platform: ubuntu-22.04
-
Subsystem: Yamux., used with ws and noise.
Severity: High if it is actually an issue on JS side.
Description:
-
What you did The ping test sometimes fails when js is the dialer and nim-libp2p the listener. https://github.com/vacp2p/nim-libp2p/actions/runs/8815571518/job/24199161935?pr=1086
-
What happened Nim is receiving a
{WindowUpdate, {}, streamId: 1, length: 127}after it has received a{WindowUpdate, {Fin}, streamId: 1, length: 0}. Nim sent{Data, {Fin}, streamId: 1, length: 0}before receiving the messages above, so it believes the stream has been fully closed from both sides and doesn't accept the lastWindowUpdate. It is possible that JS still hasn't received the Nim close msg. The Yamux spec says "To close a stream, either side sends a data or window update frame along with the FIN flag. This does a half-close indicating the sender will send no further data.". Does it includeWindowUpdate?
These are the logs filtered by streamId: 1:
2024-04-24T11:58:31.0169041Z listener-1 | TRC 2024-04-24 11:58:25.717+00:00 got message topics="libp2p yamux" tid=7 h="{WindowUpdate, {Syn}, streamId: 1, length: 0}"
2024-04-24T11:58:31.0176726Z listener-1 | TRC 2024-04-24 11:58:25.717+00:00 write directly on stream topics="libp2p yamux" tid=7 h="{WindowUpdate, {Ack}, streamId: 1, length: 0}"
2024-04-24T11:58:31.0239117Z listener-1 | TRC 2024-04-24 11:58:25.726+00:00 got message topics="libp2p yamux" tid=7 h="{Data, {}, streamId: 1, length: 36}"
2024-04-24T11:58:31.0633568Z listener-1 | TRC 2024-04-24 11:58:25.771+00:00 try to send the buffer topics="libp2p yamux" tid=7 h="{Data, {}, streamId: 1, length: 20}"
2024-04-24T11:58:31.0641794Z listener-1 | TRC 2024-04-24 11:58:25.772+00:00 try to send the buffer topics="libp2p yamux" tid=7 h="{Data, {}, streamId: 1, length: 16}"
2024-04-24T11:58:31.0676154Z listener-1 | TRC 2024-04-24 11:58:25.775+00:00 try to send the buffer topics="libp2p yamux" tid=7 h="{Data, {}, streamId: 1, length: 127}"
2024-04-24T11:58:31.0683902Z listener-1 | TRC 2024-04-24 11:58:25.775+00:00 write directly on stream topics="libp2p yamux" tid=7 h="{Data, {Fin}, streamId: 1, length: 0}"
2024-04-24T11:58:31.1071982Z listener-1 | TRC 2024-04-24 11:58:25.808+00:00 got message topics="libp2p yamux" tid=7 h="{WindowUpdate, {}, streamId: 1, length: 20}"
2024-04-24T11:58:31.1112419Z listener-1 | TRC 2024-04-24 11:58:25.811+00:00 got message topics="libp2p yamux" tid=7 h="{WindowUpdate, {}, streamId: 1, length: 16}"
2024-04-24T11:58:31.1194427Z listener-1 | TRC 2024-04-24 11:58:25.818+00:00 got message topics="libp2p yamux" tid=7 h="{WindowUpdate, {Fin}, streamId: 1, length: 0}"
2024-04-24T11:58:31.1313826Z listener-1 | TRC 2024-04-24 11:58:25.831+00:00 got message topics="libp2p yamux" tid=7 h="{WindowUpdate, {}, streamId: 1, length: 127}"
- What you expected to happen Not sure yet.
Steps to reproduce the error:
The test is flaky, but running it sometimes reproduces the situation described.
Hey @achingbrain! Would you be able to help validating this? We at js-waku are facing troubles upgrading from mplex to yamux because of an apparent incompatibility against nim-libp2p over Yamux
Perhaps this issue needs to be ported to https://github.com/ChainSafe/js-libp2p-yamux?
@danisharora099 yes, that's the right place for this issue. Unfortunately I don't think GitHub lets us transfer issues between orgs, only between repos in the same org so it'll have to be recreated there..
It might be enough to update the stream state in YamuxStream.sendCloseWrite to be StreamState.Finished, then check this.state is not StreamState.Finished in YamuxStream.sendWindowUpdate before sending the update.
What do you think @wemeetagain?
Opened an issue on js-libp2p-yamux as well: https://github.com/ChainSafe/js-libp2p-yamux/issues/91
cc @achingbrain @wemeetagain
What do you think @wemeetagain?
Yes this sound sensible
Digging into this a bit:
The problem is we're unilaterally sending a window update on consumption of each source chunk. Definitely missing a conditional to stop doing that. I think we can check that the read state is not closed or closing.