Git update sometimes crashes after #585
Occasionally, running haxelib update on a git repository crashes (on Windows).
Crash logs
$ haxelib update hxcpp --debug
[debug] Using haxelib from "..."
# Running command: git [diff,--exit-code,--no-ext-diff]
# Exited with code 0
# Running command: git [diff,--cached,--exit-code,--no-ext-diff]
# Exited with code 0
# Running command: git [fetch]
# Exited with code 0
# Running command: git [rev-parse,@{u}]
3f6de84d4decb0a7aa1131ebd527f623b1e2d2b1
# Exited with code 0
# Running command: git [rev-parse,HEAD]
An exception occurred in a neko Thread :
std@lock_release
An exception occurred in a neko Thread :
std@lock_release
Called from ? line 1
Called from haxelib/Util.hx line 14
Called from haxelib/client/Main.hx line 895
Called from haxelib/Util.hx line 14
Called from haxelib/client/Main.hx line 268
Called from haxelib/client/Main.hx line 583
Called from haxelib/api/Installer.hx line 372
Called from haxelib/api/Installer.hx line 373
Called from haxelib/api/Installer.hx line 426
Called from haxelib/Util.hx line 14
Called from haxelib/api/Installer.hx line 826
Called from haxelib/api/Vcs.hx line 288
Called from haxelib/api/Vcs.hx line 167
Called from haxelib/api/Vcs.hx line 206
Called from C:\HaxeToolkit\haxe\std/neko/_std/sys/io/Process.hx line 108
Uncaught exception - std@process_exit
$ haxelib update hxcpp --debug
[debug] Using haxelib from "..."
# Running command: git [diff,--exit-code,--no-ext-diff]
# Exited with code 0
# Running command: git [diff,--cached,--exit-code,--no-ext-diff]
# Exited with code 0
# Running command: git [fetch]
# Exited with code 0
# Running command: git [rev-parse,@{u}]
3f6de84d4decb0a7aa1131ebd527f623b1e2d2b1
# Exited with code 0
# Running command: git [rev-parse,HEAD]
An exception occurred in a neko Thread :
std@lock_release
An exception occurred in a neko Thread :
std@lock_release
Called from ? line 1
Called from haxelib/Util.hx line 14
Called from haxelib/client/Main.hx line 895
Called from haxelib/Util.hx line 14
Called from haxelib/client/Main.hx line 268
Called from haxelib/client/Main.hx line 583
Called from haxelib/api/Installer.hx line 372
Called from haxelib/api/Installer.hx line 373
Called from haxelib/api/Installer.hx line 426
Called from haxelib/Util.hx line 14
Called from haxelib/api/Installer.hx line 826
Called from haxelib/api/Vcs.hx line 288
Called from haxelib/api/Vcs.hx line 167
Called from haxelib/api/Vcs.hx line 209
Called from C:\HaxeToolkit\haxe\std/neko/_std/sys/thread/Lock.hx line 34
Uncaught exception - std@lock_wait
$ haxelib update hxcpp --debug
[debug] Using haxelib from "..."
# Running command: git [diff,--exit-code,--no-ext-diff]
# Exited with code 0
# Running command: git [diff,--cached,--exit-code,--no-ext-diff]
# Exited with code 0
# Running command: git [fetch]
# Exited with code 0
# Running command: git [rev-parse,@{u}]
3f6de84d4decb0a7aa1131ebd527f623b1e2d2b1
# Exited with code 0
# Running command: git [rev-parse,HEAD]
Called from ? line 1
Called from haxelib/Util.hx line 14
Called from haxelib/client/Main.hx line 895
Called from haxelib/Util.hx line 14
Called from haxelib/client/Main.hx line 268
Called from haxelib/client/Main.hx line 583
Called from haxelib/api/Installer.hx line 372
Called from haxelib/api/Installer.hx line 373
Called from haxelib/api/Installer.hx line 426
Called from haxelib/Util.hx line 14
Called from haxelib/api/Installer.hx line 826
Called from haxelib/api/Vcs.hx line 288
Called from haxelib/api/Vcs.hx line 167
Called from haxelib/api/Vcs.hx line 206
Called from C:\HaxeToolkit\haxe\std/neko/_std/sys/io/Process.hx line 108
Uncaught exception - std@process_exit
Other times everything is fine:
$ haxelib update hxcpp --debug
[debug] Using haxelib from "..."
# Running command: git [diff,--exit-code,--no-ext-diff]
# Exited with code 0
# Running command: git [diff,--cached,--exit-code,--no-ext-diff]
# Exited with code 0
# Running command: git [fetch]
# Exited with code 0
# Running command: git [rev-parse,@{u}]
3f6de84d4decb0a7aa1131ebd527f623b1e2d2b1
# Exited with code 0
# Running command: git [rev-parse,HEAD]
3f6de84d4decb0a7aa1131ebd527f623b1e2d2b1
# Exited with code 0
Library hxcpp git repository is already up to date
This only seems to happen during the git rev-parse HEAD command. It does not seem to happen on 4.0.3, only on the development branch, which makes sense since 4.0.3 does not use git rev-parse HEAD.
This may be related to https://github.com/HaxeFoundation/neko/issues/281, since both started happening when the threads were added in #585.
I've also been able to reproduce this with hashlink (and previously with hxcpp too). This makes sense as their code is quite similar.
It seems that when attempting to create a new lock, CreateSemaphore is somehow returning a handle to an existing semaphore already linked to another lock. This happens near the time when the previous lock should be garbage collected (and CloseHandle should run). This results in the new lock being created with an invalid semaphore handle.
It could be a race condition where CloseHandle and CreateSemaphore are somehow interleaved so CreateSemaphore returns what it thinks is an unused handle, only for it to be subsequently marked invalid by CloseHandle. I haven't found definitive information about whether or not these are thread-safe.
This happens because of the following situation:
- Process stdin handle A is closed at p.stdin.close(), releasing handle A
- Lock 1 (for git invocation 1) receives handle A
- Process stdin handle A is closed again at p.close(), releasing handle A again while Lock 1 still holds it
- Lock 2 (for git invocation 2) also receives handle A
- GC RUN: Lock 1 is closed, releasing handle A again
- Lock 2 tries to run something, but handle A is now invalid
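The sequence above can be replayed with a toy model. This is only a sketch: HandleTable, lock2StillValid and skipSecondClose are hypothetical names, and the "handle table" is a plain map that reuses closed values. It is not the Windows API, and it assumes that a freed handle value is handed out again on the next allocation. It shows how a redundant close lets two locks end up sharing one handle value.

```haxe
// Toy model only: this is not the Windows API, and every name here is hypothetical.
class HandleTable {
    final open = new Map<Int, String>(); // handle value -> current owner
    final freed:Array<Int> = [];         // closed values, handed out again first
    var next = 1;

    public function new() {}

    // Hand out a handle value, reusing a previously closed one if available.
    public function create(owner:String):Int {
        final h:Int = freed.length > 0 ? freed.pop() : next++;
        open.set(h, owner);
        return h;
    }

    // Close a handle value regardless of who owns it now.
    public function close(h:Int):Void {
        if (open.remove(h)) freed.push(h);
    }

    public function isValid(h:Int):Bool {
        return open.exists(h);
    }
}

// Replays the sequence from the list above and returns whether
// lock 2's handle is still valid at the end.
function lock2StillValid(skipSecondClose:Bool):Bool {
    final table = new HandleTable();
    final stdinHandle = table.create("process stdin"); // handle A
    table.close(stdinHandle);                          // p.stdin.close()
    final lock1 = table.create("lock 1");              // receives handle A
    if (!skipSecondClose)
        table.close(stdinHandle);                      // p.close() closes A again
    final lock2 = table.create("lock 2");              // receives A too, unless skipped
    table.close(lock1);                                // GC run finalizes lock 1
    return table.isValid(lock2);
}

function main() {
    trace(lock2StillValid(false)); // double close: lock 2 is left with a dead handle
    trace(lock2StillValid(true));  // no second close: lock 2 keeps its own handle
}
```

In this model, the first call reports that lock 2's handle is no longer valid, and the second reports that it still is. The skipSecondClose flag only illustrates that the collision disappears once the redundant close is gone. It is not a description of how any runtime actually guards against it.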
So, it turns out https://github.com/HaxeFoundation/haxelib/pull/642 would have actually dodged the issue... 😅
The bug can be reproduced more simply with this sample:
function main() {
    final p = new sys.io.Process("cmd", ["/c", "echo hello world"]);
    p.stdin.close(); // first close of the process stdin handle
    final streamsLock = new sys.thread.Lock(); // sandwiched between p.stdin.close() and p.close()
    p.close(); // closes the stdin handle again, which can invalidate the lock's handle
    streamsLock.release();
    streamsLock.wait();
}
The bug will be fixed by patching neko and hashlink, see https://github.com/HaxeFoundation/neko/pull/300 and https://github.com/HaxeFoundation/hashlink/pull/743
Hxcpp seems to handle this properly, but in my testing I found that some Windows API errors are being ignored completely, so that will require further investigation.
The other issue, https://github.com/HaxeFoundation/neko/issues/281, remains. However, it is not related to this problem and seems to be an architectural problem with neko.