Haxe compiler hangs nondeterministically - "SuspendThread failed"

Open c-g-dev opened this issue 2 years ago • 1 comments

Almost half the time I run a haxe build, the process hangs and I have to force quit the terminal.

Even with something as simple as this compile.hxml:

-cp src -hl build.hl --main Main

and running

haxe compile.hxml

Sometimes it freezes the terminal, sometimes it causes the following popup with the text "Fatal error in GC: SuspendThread failed".

[Screenshot: suspendthreadfailed]

Then when you close the popup the following message is printed:

ERROR: Process exited with code 1
Error: Failed to call haxelib (command not found ?)

Sometimes it doesn't freeze the terminal and the process can be ctrl+c exited with the following error message:

haxe compile.hxml
Fatal error: exception Stdlib.Sys.Break

This happens across all of my projects, for no reason I can find, and has been happening for months. I'm running Windows 10, and both Haxe and Haxelib are updated to their most recent versions.
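To put a number on "almost half the time", one option is to run the build repeatedly under a timeout and count how often it hangs or fails. The following is a minimal Python sketch, not part of the original report; the helper name `measure_hang_rate`, the 60-second timeout, and the run count are illustrative assumptions.

```python
import subprocess


def measure_hang_rate(cmd, runs=10, timeout_s=60.0):
    """Run cmd repeatedly and tally clean exits, failures, and timeouts."""
    tally = {"ok": 0, "failed": 0, "hung": 0}
    for _ in range(runs):
        try:
            # subprocess.run kills the direct child if the timeout expires.
            result = subprocess.run(
                cmd,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            tally["hung"] += 1
            continue
        tally["ok" if result.returncode == 0 else "failed"] += 1
    return tally


# Example call (assumes haxe is on PATH and compile.hxml is in the
# current directory):
#   print(measure_hang_rate(["haxe", "compile.hxml"]))
```

On a timeout only the direct child process is killed, so any processes it spawned may be left running and would need to be cleaned up separately between runs.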

One thing to note is that I often have 2 or 3 instances of VSCode open, working on different haxe projects. Opening task manager I will see many instances of haxe.exe and haxelib.exe running:

[Screenshot: haxetaskmanage]

So I don't know whether there might be a race condition between the multiple open processes or something.
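The process pile-up seen in Task Manager can also be tracked from a script, for example by counting image names in the CSV output of the Windows `tasklist` command before and after a build. This is a rough sketch under stated assumptions: `count_images` and `snapshot` are hypothetical helper names, and the watched image names are simply the two mentioned above.

```python
import csv
import io
import subprocess
from collections import Counter

# Image names taken from the report; adjust as needed.
WATCHED = ("haxe.exe", "haxelib.exe")


def count_images(tasklist_csv, watched=WATCHED):
    """Count watched image names in `tasklist /FO CSV /NH` output."""
    wanted = {name.lower() for name in watched}
    counts = Counter()
    for row in csv.reader(io.StringIO(tasklist_csv)):
        # The first CSV column is the image name, e.g. "haxe.exe".
        if row and row[0].lower() in wanted:
            counts[row[0].lower()] += 1
    return dict(counts)


def snapshot():
    """Windows only: query the live process list via tasklist."""
    out = subprocess.run(
        ["tasklist", "/FO", "CSV", "/NH"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return count_images(out)
```

Calling `snapshot()` with all editors closed and again after a hung build would show whether stale processes accumulate, which is one way to test the multiple-instance theory.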

c-g-dev · Nov 11 '23 18:11

FYI this seems similar to the following issue raised for the neko compiler 15 years ago:

https://www.mail-archive.com/[email protected]/msg01934.html

"I have been using haxe & neko on my quadcore WinXPProSP2 PC for a couple of years now. I always occasionally got a "Fatal error in gc / SuspendThread failed" dialog window when running a neko executable (incl. the haxe compiler). The haxe compile command reports "Error : Neko compilation failure".

However, since upgrading to neko 1.8, it seems to have a lot worse. I would say that for every 5 times I invoke the haxe compiler, it successfully completes only once .. the other times I get the Fatal error message. If I try the simple "neko test" command, I get approx 1 failure for every 6 or 7 invocations (although the console reports "Test successful" when the fatal error message is shown"). I have disabled my virus checker (eset NOD32), then removed the virus checker completely, with no effect."

But I'm not running neko anywhere in my stack, and I'm not sure whether the haxe compiler or haxelib uses neko internally. In any case, this is an ancient issue that was presumably fixed for neko over a decade ago.

This also doesn't explain the accompanying process hanging whenever the haxe compiler runs. These issues only happen when running the haxe compiler or running a haxelib script (which internally calls Sys.command(haxe Run.hx)). There could be some threading deadlock happening in the eval context, which seems to handle threads in a weird way.

EDIT: After checking the Windows Resource Monitor the wait chain appears to be haxe compiler -> haxelib.exe -> neko.exe, and neko.exe is not returning. So there is definitely some issue with neko, or at least the way that haxelib calls neko.

c-g-dev · Dec 17 '23 05:12