Blocking I/O behavior in localhost relay hangs with full duplex traffic due to half duplex forwarding logic.
Problem Description
The implementation of localhost relaying on the Linux portion is using blocking I/O in a single thread to copy both sides of a connection. While this will work with a half-duplex request-reply style system (e.g. http/1.1), it will lead to connection hangs when there is a multiplexed protocol and bidirectional traffic (e.g. ssh, http/2, etc).
As an example using the reproducer linked later in this report, the following strace output shows a thread in the localhost forwarding process (init, run under the name localhost in the host CBL OS) hung while writing to its destination VSOCK connection (FD=6). This blocks forever since the other end of the connection (wslrelay.exe + the client win application) has a full read buffer, and the protocol cannot advance until the WSL init process reads from the VSOCK fd (opening up more buffer space between the two ends).
[pid 778] ppoll([{fd=6, events=POLLIN}, {fd=9, events=POLLIN}], 2, NULL, NULL, 8) = 2 ([{fd=6, revents=POLLIN}, {fd=9, revents=POLLIN}])
[pid 778] read(6, "table language.\n\nThe most import"..., 131072) = 16384
[pid 778] write(9, "table language.\n\nThe most import"..., 16384) = 16384
[pid 778] read(9, "suitable language.\n\nThe most imp"..., 131072) = 131072
[pid 778] write(6, "suitable language.\n\nThe most imp"..., 131072 <unfinished ...> <----- Hang
While the Windows side (wslrelay.exe) is using Overlapped I/O, it simulates a similar write-blocking pattern. Relay.cpp polls reads on both the VSOCK and TCP win app connection using WaitForMultipleObjects; however, the writing portion on both ends calls an Overlapped WriteFile immediately followed by a WaitForMultipleObjects on that descriptor, blocking until the write fully completes.
Suggested Fix
On the Linux side, I would recommend switching the socket to non-blocking, and introducing a buffer for each direction of the connection, which is managed by a simple state machine. The read limit for each direction would need to be reduced by the leftover amount of an incomplete write (essentially pre-filling the read buffer) to establish back-pressure and allow SOCK_STREAM flow-control to throttle an abusive peer, as well as limit resource consumption.
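A minimal sketch of the per-direction buffer bookkeeping described above, in Go to match the reproducer. It is not the WSL implementation: the `direction` type, its methods, and `errWouldBlock` (standing in for EAGAIN from a non-blocking socket) are hypothetical names, and the poll loop that would call them is omitted. The point it illustrates is that bytes left over from a short write stay at the front of the buffer and shrink the next read.

```go
package main

import (
	"errors"
	"io"
)

// errWouldBlock is a hypothetical stand-in for EAGAIN on a non-blocking fd.
var errWouldBlock = errors.New("would block")

// direction buffers bytes read from one side that have not yet been
// written to the other side. One instance is needed per direction.
type direction struct {
	buf []byte // fixed-size buffer; buf[:n] holds the pending bytes
	n   int
	eof bool // the source has reached EOF
}

func newDirection(limit int) *direction {
	return &direction{buf: make([]byte, limit)}
}

// wantRead reports whether the source should be polled for readability.
// A full buffer stops reads, which lets socket flow control throttle the peer.
func (d *direction) wantRead() bool { return !d.eof && d.n < len(d.buf) }

// wantWrite reports whether the destination should be polled for writability.
func (d *direction) wantWrite() bool { return d.n > 0 }

// fill reads into the free space only, so the read size is the limit
// minus whatever is left over from an incomplete write.
func (d *direction) fill(read func([]byte) (int, error)) error {
	if !d.wantRead() {
		return nil
	}
	n, err := read(d.buf[d.n:])
	d.n += n
	switch {
	case errors.Is(err, errWouldBlock):
		return nil
	case errors.Is(err, io.EOF):
		d.eof = true
		return nil
	}
	return err
}

// drain writes pending bytes; after a short write the remainder is
// moved to the front of the buffer and kept for the next attempt.
func (d *direction) drain(write func([]byte) (int, error)) error {
	n, err := write(d.buf[:d.n])
	copy(d.buf, d.buf[n:d.n])
	d.n -= n
	if errors.Is(err, errWouldBlock) {
		return nil
	}
	return err
}

// done reports that the source hit EOF and everything was flushed, which
// is the moment to shut down the write side of the destination.
func (d *direction) done() bool { return d.eof && d.n == 0 }
```

An event loop would consult `wantRead` and `wantWrite` on both directions to build its poll set, so neither direction can stall the other while one peer is slow to read.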
Alternatively, if you prefer to stay with blocking strategies, you could go with a two-thread per-connection approach, where each thread copies in opposite directions of the connection. In this model, you would rely on cooperative write shutdown triggering graceful close of the connection.
On the Windows side, in addition to mirroring the above strategies, you could alter the existing Overlapped I/O model to have a singular event handler that covers writes as well as reads, although you would still want to ensure some form of back-pressure is established (limiting buffer / pending i/o) to prevent runaway event queuing.
Symptoms
This problem can often be spotted by observing persistent non-zero Send/Receive socket queues on the Linux host (indicates the application is not consuming buffer):
wsl --debug-shell
# ss -n --tcp
State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
ESTAB 418552 0 127.0.0.1:57366 127.0.0.1:9191
ESTAB 0 0 127.0.0.1:9191 127.0.0.1:57366
Reproducer
Since the existing localhost relay implementation is all local with low latency and generous buffers, hang scenarios are timing-sensitive. To make it easier to recreate, I created a stress scenario of a simple synthetic multiplexed protocol here: https://github.com/n1hility/duplex-stress/ (Under AL2 license, but let me know if you prefer under any other terms)
In my testing it triggers a hang condition pretty quickly (tested on multiple environments including Windows 10 and 11, on ARM and x64). It's written in Golang for Win/Linux portability and to avoid any runtime requirements.
To recreate:
- Install Golang 1.21+ and build it:
PS> .\build.ps1
- Run the server in WSL:
PS> wsl
$ ./duplex-stress server 0.0.0.0 9191
- Run the client from Windows
PS> .\duplex-stress client 127.0.0.1 9191
- Observe hang (text will stop printing, socket will be stuck as described above)
- Ctrl-C both and restart using direct TCP
- Start new server on Linux with a different port (9191 is now dead from hang)
$ ./duplex-stress server 0.0.0.0 9192
- Connect client on Windows using IP addr of WSL dist
PS> .\duplex-stress client $(wsl hostname -I).Trim() 9192
- Observe that it works
Versions
Observed reproducer with 2.0.5 and 1.2.5
FYI @benhillis @craigloewen-msft , this one is likely frequently encountered
@benhillis @craigloewen-msft BTW I'm not sure if you guys have thought about open sourcing any of the components of WSL, now that it's more decoupled from the Windows stream, but just so you are aware for the future, I would be totally down with sending you guys patches for issues like this.
Thank-you for the exceptionally well written bug!
Thank you n1hility for taking the time to diagnose and write up this issue.
This problem is preventing podman from creating container images, which greatly limits podman's use. I've installed version 2.2.2 of WSL and am still unable to create images. Is there any update on this issue?
I figured I'd give VS Code Dev Containers a shot again and was able to get it working on my Intel MacBook Pro. In my excitement to be able to use this feature w/ Podman, I wanted to try it on my Windows machine too. To my disappointment, however, it gets stuck on sending tarball.
This issue pretty much means that using Podman w/ VS Code Dev Containers is 💯 a no go on Windows with WSL until this networking issue is addressed. If there's any way this could get some kind of kick into gear it would be appreciated! If I was as familiar with it as @n1hility, I'd offer to assist as well. Heck I'd be willing to QA it on my machine if that's an option!
Possible workarounds for the time being: assuming that this is because of the localhost relay, you can bypass the relay by trying mirrored mode. The other option would be to use the address WSL has on the virtual switch it shares with the host.
Mirror mode works. Thank you so much for the suggestion.
I should mention that mirror mode networking allowed me to create an image using podman but then failed to run it emulating ARM64. Docker's licensing terms make using it not really an option. With this problem I can't create images with Podman. Is there any tool that does allow creating and running container images on Windows or an alternative workaround?
Is there a timeline for a fix, or any alternative workarounds available?
Unfortunately, enabling mirrored mode isn't an option for me, as it causes issues with Podman:
Starting machine "podman-machine-default"
your 131072x1 screen size is bogus. expect trouble
API forwarding for Docker API clients is not available due to the following startup failures.
could not start api proxy since expected pipe is not available: podman-machine-default
Podman clients are still able to connect.
Error: machine did not transition into running state: ssh error: machine is not listening on ssh port
Command execution failed with exit code 125
Based on https://github.com/containers/podman/issues/22975, mirrored mode isn't officially supported by podman, so it might work for some people but not others. It didn't work on:
Windows 11
podman version 5.5.2
WSL version: 2.5.9.0
Docker version 28.3.2
Are there any plans to fix this?