WSL Distro terminated abruptly

Open VaanaraRaaja opened this issue 6 months ago • 13 comments

Windows Version

Windows 11

WSL Version

2.4.13.0

Are you using WSL 1 or WSL 2?

  • [x] WSL 2
  • [ ] WSL 1

Kernel Version

5.15.167.4-1

Distro Version

Ubuntu-22.04

Other Software

Docker version 28.0.4, build b8034c0

Repro Steps

Downgraded WSL due to the earlier issues with WSL crashing. The crashes continue.

Diagnostics ID - BCA194C1-5C48-4AEC-9783-A2D362714370/20250604171052

Expected Behavior

No crash

Actual Behavior

Happens often

Diagnostic Logs

Diagnostics ID - BCA194C1-5C48-4AEC-9783-A2D362714370/20250604171052

VaanaraRaaja avatar Jun 04 '25 17:06 VaanaraRaaja

Logs are required for review from WSL team

If this is a feature request, please reply with '/feature'. If this is a question, reply with '/question'. Otherwise, please attach logs by following the instructions below; your issue will not be reviewed unless they are added. These logs will help us understand what is going on in your machine.

How to collect WSL logs

Download and execute collect-wsl-logs.ps1 in an administrative powershell prompt:

Invoke-WebRequest -UseBasicParsing "https://raw.githubusercontent.com/microsoft/WSL/master/diagnostics/collect-wsl-logs.ps1" -OutFile collect-wsl-logs.ps1
Set-ExecutionPolicy Bypass -Scope Process -Force
.\collect-wsl-logs.ps1

The script will output the path of the log file once done.

If this is a networking issue, please use collect-networking-logs.ps1, following the instructions here

Once completed please upload the output files to this GitHub issue.

Click here for more info on logging

If you choose to email these logs instead of attaching to the bug, please send them to [email protected] with the number of the GitHub issue in the subject, and in the message a link to your comment in the GitHub issue and reply with '/emailed-logs'.

github-actions[bot] avatar Jun 04 '25 17:06 github-actions[bot]

Diagnostics ID - BCA194C1-5C48-4AEC-9783-A2D362714370/20250604172512

WslLogs-2025-06-04_13-24-22.zip

VaanaraRaaja avatar Jun 04 '25 17:06 VaanaraRaaja

Diagnostic information
Detected appx version: 2.4.13.0
Detected user visible error: Wsl/Service/AttachDisk/MountVhd/WSL_E_USER_VHD_ALREADY_ATTACHED

github-actions[bot] avatar Jun 04 '25 17:06 github-actions[bot]

Looking at the logs, it looks like the distribution VHDX was already in use by someone else. Did you use wsl --mount on that disk, or did another program access that VHDX file?

OneBlue avatar Jun 04 '25 23:06 OneBlue

I lost all my containers and images when the issues started two days ago (after which you asked me to downgrade), so I am rebuilding now.

I was running docker compose on WSL and it failed. I did not collect the Windows logs at that point, but I provided the first diagnostics ID in the first comment. Then, to get logs, I ran the ps1 script, started the run again, and got this error almost immediately.

Running log collection at the start of a session is iffy, since the crashes do not follow a pattern: sometimes one happens immediately, other times after 20 minutes.

A very simple docker compose setup works without issues.

Both 2.5.x and 2.4.x fail. The difference is that 2.5.x freezes or hangs WSL, while 2.4.x allows termination by pressing Enter in the shell.

VaanaraRaaja avatar Jun 04 '25 23:06 VaanaraRaaja

Interesting. Sadly the Docker diagnostic IDs don't help me since I don't have access to them. To root cause this we'd need logs from a repro so we can see why the distribution is stopping.

I'll leave the issue open so you can publish logs if you manage to capture a repro under the log collection script.

/logs

OneBlue avatar Jun 04 '25 23:06 OneBlue

I have started setting up dual boot out of frustration: over the past week I lost all my work, and I continue to have issues.

I will reinstall WSL and Docker and give it another try to capture logs. I would prefer to get WSL working without issues, so a few extra hours are worth it to capture logs.

VaanaraRaaja avatar Jun 04 '25 23:06 VaanaraRaaja

Diagnostic information
Detected appx version: 2.4.13.0
Detected user visible error: Wsl/Service/CreateInstance/E_UNEXPECTED
Detected user visible error: Wsl/Service/CreateInstance/E_UNEXPECTED
Detected user visible error: Wsl/Service/CreateInstance/E_UNEXPECTED
Detected user visible error: Wsl/Service/CreateInstance/E_UNEXPECTED

github-actions[bot] avatar Jun 05 '25 01:06 github-actions[bot]

I realized that the Docker builds were consistently failing (crashing WSL) during the sageattention and/or flashattention setup steps, so I removed those. I also added a .wslconfig that allows more memory and processors, since I have a Ryzen 9 with 64 GB. With this setup, at least the first build succeeded. I am not sure whether that is what the logs actually captured; if so, the errors could be clearer. Or it could be something totally different, since I have now eliminated those steps from the setup.
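
For readers following along, a minimal sketch of what such a .wslconfig might look like. The `memory` and `processors` keys are standard `[wsl2]` settings, but the values below are illustrative assumptions for a 64 GB machine, not the ones used in this thread.

```ini
# %UserProfile%\.wslconfig (values are hypothetical; adjust to your hardware)
[wsl2]
# Upper bound on the RAM the WSL 2 VM may use
memory=48GB
# Number of logical processors exposed to the VM
processors=16
```

The settings are read when the WSL 2 VM starts, so the VM has to be restarted (for example with `wsl --shutdown`) before they apply.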

VaanaraRaaja avatar Jun 05 '25 02:06 VaanaraRaaja

Thank you @VaanaraRaaja. Looking at the logs, I can see the issue still seems to be memory:

53986	False	Microsoft.Windows.Lxss.Manager	GuestLog	0	06-04-2025 18:44:02.392	"	"	"text: 	""[  670.874628] Out of memory: Killed process 416 (networkd-dispat) total-vm:30072kB, anon-rss:0kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:100kB oom_score_adj:0 ""
vmId: 	{e3265a3a-d731-476f-ac11-db3ec5cb6ddf}"				4928	12276	5		00000000-0000-0000-0000-000000000000		

Does giving more memory solve the issue here? If so, that would explain the errors you were seeing.
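
A minimal sketch of how one might scan log text for OOM-killer messages like the one quoted above. It assumes the kernel messages are available as plain text lines (for example, `dmesg` output captured inside the distribution); the function name `find_oom_kills` and the second sample line are hypothetical.

```python
import re

# Matches the Linux OOM-killer message format seen in the quoted log line.
OOM_RE = re.compile(
    r"Out of memory: Killed process (?P<pid>\d+) \((?P<name>[^)]*)\)"
    r".*?anon-rss:(?P<anon_rss_kb>\d+)kB"
)


def find_oom_kills(lines):
    """Return (pid, process name, anon-rss in kB) for each OOM kill found."""
    kills = []
    for line in lines:
        match = OOM_RE.search(line)
        if match:
            kills.append(
                (
                    int(match.group("pid")),
                    match.group("name"),
                    int(match.group("anon_rss_kb")),
                )
            )
    return kills


sample = [
    # Kernel message taken from the log excerpt in this thread.
    "[  670.874628] Out of memory: Killed process 416 (networkd-dispat) "
    "total-vm:30072kB, anon-rss:0kB, file-rss:0kB, shmem-rss:0kB, UID:0 "
    "pgtables:100kB oom_score_adj:0",
    # Made-up unrelated line, to show that non-matching lines are skipped.
    "[  671.000000] some unrelated kernel message",
]

print(find_oom_kills(sample))
```

Many matches in a short time window would suggest that the VM is running out of memory rather than hitting an unrelated failure.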

OneBlue avatar Jun 10 '25 21:06 OneBlue

Nope, did not help. I am now trying to install from source after I built the docker, since it was too frustrating running the full docker build again and again.

VaanaraRaaja avatar Jun 11 '25 21:06 VaanaraRaaja

This happens to me all the time. I've reproduced the problem, a log is attached.

WslLogs-2025-06-13_09-28-25.zip

Just as a fairly blind guess as to what is relevant: looking at it in PerfView, I can see that under the Microsoft.Windows.Subsystem.Lxss/LxssException category there is the following entry when the problem happens:

ThreadID	19,944
ProcessorNumber	3
File	C:\__w\1\s\src\windows\common\svccomm.cpp
FunctionName	
Linenumber	729
Type	0
HRESULT	-2,147,418,113
Message	
Code	

My reproduction steps:

  1. VSCode running, in a state where it is disconnected from the container, trying to reconnect, failing
  2. Open a WSL shell in powershell, run a script which runs a number of containers in docker-compose, including running webpack builds which open/watch a large number of files.
  3. Wait a few minutes, it seems to crash reliably with Service/E_UNEXPECTED

hedgepigdaniel avatar Jun 12 '25 23:06 hedgepigdaniel

> Nope, did not help. I am now trying to install from source after I built the docker, since it was too frustrating running the full docker build again and again.

Ok I see. Swap might be what you're missing if you need a lot of memory then. Can you try to add this to .wslconfig?

[wsl2]
swap=32GB

This should help resolve the memory limitations at least. Please share /logs if you hit another issue

OneBlue avatar Oct 17 '25 19:10 OneBlue

@OneBlue thanks for the tip - I've run with 32GB of swap for a few days - I think that has actually solved the problem for me! I was getting many crashes per day, and have had 0 since I did that.

I wonder if there's an underlying problem that leads to such a symptom? On a real machine running my typical Linux setup, running out of memory (in my experience) leads to hanging/freezing, but not crashes. I wonder why WSL crashes completely? And I wonder if there is something that can be done to make that issue more discoverable? The error E_UNEXPECTED is not useful at all, and clearly a lot of people have lost a lot of time as a result.

hedgepigdaniel avatar Oct 22 '25 00:10 hedgepigdaniel

Well, the VM freezing would cause timeouts and closed sockets from Windows' perspective, which would cause those errors (especially if the OOM killer triggers).

Closing since enabling swap solved the issue.

OneBlue avatar Oct 23 '25 22:10 OneBlue