hxcpp

Rare startup crash when using multiple threads

Open RobDangerous opened this issue 5 years ago • 7 comments

For a long time I have been experiencing a very rare startup crash in GCFreeZone (see also my confusion in #758). When it happens at all, it happens rarely and mostly in release builds, and whenever it started happening more often, adding a small sleep or shifting things around helped. But tonight I managed to create a situation that triggered the issue very reliably, and I tried to use that to finally find the cause... I'm still not 100% sure, but I think the problem is caused by setting gMultiThreadMode to true at exactly the wrong moment, namely after entering and before leaving a gc-free-zone. I fixed it for Kha by just setting it to true from the start (Kha always runs multithreaded anyway), and so far things seem to be fine for me.
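The suspected race can be illustrated with a toy model. This is only a sketch and not hxcpp's code: `ToyRuntime`, `enterFreeZone`, `exitFreeZone` and both demo functions are hypothetical names, and the model assumes that entering and leaving a zone are both skipped while the runtime still believes it is single-threaded.

```cpp
#include <cassert>

// Toy model only: none of these names are hxcpp's. It ASSUMES that entering
// and leaving a GC-free zone are both skipped while the flag is false.
struct ToyRuntime {
    bool multiThreadMode = false;   // stands in for gMultiThreadMode
    int freeZoneDepth = 0;          // 1 while the thread is inside a zone
    bool sawUnbalancedExit = false; // set when an exit has no recorded enter

    void enterFreeZone() {
        if (!multiThreadMode) return; // fast path: nothing is recorded
        ++freeZoneDepth;
    }

    void exitFreeZone() {
        if (!multiThreadMode) return;
        if (freeZoneDepth == 0) {     // exit without a recorded enter
            sawUnbalancedExit = true;
            return;
        }
        --freeZoneDepth;
    }
};

// Flip the flag between enter and exit, as a second thread attaching would.
bool flipInsideZone() {
    ToyRuntime rt;
    rt.enterFreeZone();        // skipped, the flag is still false
    rt.multiThreadMode = true; // second thread attaches here
    rt.exitFreeZone();         // now runs for real, with no matching enter
    return rt.sawUnbalancedExit;
}

// The workaround from the post: the flag is true before any zone is entered.
bool flagSetFromStart() {
    ToyRuntime rt;
    rt.multiThreadMode = true;
    rt.enterFreeZone();
    rt.exitFreeZone();
    return rt.sawUnbalancedExit;
}
```

In this model `flipInsideZone()` reports an unbalanced exit while `flagSetFromStart()` does not, which matches the observation that setting the flag up front made the crash go away.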

RobDangerous, Jul 21 '19 23:07

Yes, this is very plausible. I have not properly analysed the transitions of this variable. How is it that the second thread attaches? Is it from foreign code, or is it from haxe code? I.e., could you reproduce this with just haxe code, or do you need a system/ui thread attaching?

hughsando, Jul 23 '19 01:07

It's from foreign code, this code in particular: https://github.com/Kode/Kha/blob/master/Backends/Kore/main.cpp#L238

RobDangerous, Jul 23 '19 16:07

Yes, that makes sense - probably have not tested that path too much.

There might be an issue with that thread staying registered after the mixing has stopped, or if the thread is paused and not making any GC calls for a while. Another thread may need to do a GC and end up waiting for this thread to "check in". You could de-register the thread, or keep it in a "gc free zone" as long as possible. If '_callCallback' is the haxe code, you would exit the zone, make this call, and enter the zone again.
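The exit, call, re-enter sequence can be sketched with a scope guard. This is a standalone model under stated assumptions: `inGcFreeZone`, `enterZone`, `exitZone`, `ScopedZoneExit` and `audioCallback` are hypothetical names, and real code would call the runtime's own enter/exit functions instead of toggling a flag.

```cpp
#include <cassert>
#include <functional>

// Hypothetical per-thread state; a real integration would call the runtime's
// own enter/exit functions here instead of toggling a flag.
thread_local bool inGcFreeZone = false;

void enterZone() { inGcFreeZone = true; }
void exitZone()  { inGcFreeZone = false; }

// Leaves the zone for the lifetime of the guard and re-enters it on
// destruction, including when the callback throws.
class ScopedZoneExit {
public:
    ScopedZoneExit() { exitZone(); }
    ~ScopedZoneExit() { enterZone(); }
    ScopedZoneExit(const ScopedZoneExit&) = delete;
    ScopedZoneExit& operator=(const ScopedZoneExit&) = delete;
};

// Called by the native audio system; `mix` stands in for the managed callback.
// Returns whether the thread was outside the zone while `mix` ran.
bool audioCallback(const std::function<void()>& mix) {
    bool outsideDuringMix = false;
    {
        ScopedZoneExit guard;             // exit the zone
        outsideDuringMix = !inGcFreeZone;
        mix();                            // managed code may allocate here
    }                                     // guard re-enters the zone
    return outsideDuringMix;
}
```

With a guard like this, the audio thread sits in the zone whenever it is not running the callback, so a collection started by another thread does not have to wait for it.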

Also, technically, if the call-stack depth changes when mix is called (it probably won't) the top-of-stack might be funky with only the singleton SetTop call. In this case, it would be better to attach the thread, make the _callCallback call, then detach the thread.

hughsando, Jul 24 '19 01:07

Yes, I noticed that exact problem. The mixing never stops, but the thread doesn't do any allocations after it has started up, and the only things calling back into the GC system are the mutex locks. When I replaced them with plain system locks (because I don't want the audio thread to be stopped for GC collection), exactly that happened. Is that a general problem with allocation-free threads? I checked the call-stack depth though (you can still see the code for that some lines below), and luckily it stays the same on all systems I tested.

RobDangerous, Jul 24 '19 08:07

The GC requires the co-operation of all the attached threads. The system checks whether it needs to do anything as part of the allocation call, but if you are not going to do allocations (e.g., a tight, long numerical loop) you can explicitly call __hxcpp_gc_safe_point to be a good citizen.
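The idea of checking in from an allocation-free loop can be modelled with plain standard-library threads. This is a toy stop-the-world handshake, not hxcpp's implementation: `ToyGc`, `safePoint`, `numericLoop` and `collectOnce` are hypothetical names, and the polling interval of 1024 iterations is an arbitrary choice.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Toy stop-the-world handshake. All names are illustrative, not hxcpp's.
struct ToyGc {
    std::atomic<bool> collectRequested{false};
    std::atomic<bool> workerCheckedIn{false};
    std::atomic<bool> stopWorker{false};

    // If a collection is pending, check in and wait until it is over.
    void safePoint() {
        if (!collectRequested.load()) return;
        workerCheckedIn.store(true);
        while (collectRequested.load()) std::this_thread::yield();
        workerCheckedIn.store(false);
    }

    // Allocation-free numeric loop that still cooperates with the collector.
    void numericLoop() {
        unsigned long long sum = 0;
        for (unsigned long long i = 0; !stopWorker.load(); ++i) {
            sum += i % 7;
            if ((i & 1023) == 0) safePoint(); // poll every 1024 iterations
        }
        (void)sum;
    }
};

// Runs one simulated collection; returns true if the worker was parked
// at a safe point while the "collection" took place.
bool collectOnce() {
    ToyGc gc;
    std::thread worker([&gc] { gc.numericLoop(); });

    gc.collectRequested.store(true);
    while (!gc.workerCheckedIn.load()) std::this_thread::yield();
    bool worldStopped = gc.workerCheckedIn.load(); // worker is parked here

    gc.collectRequested.store(false); // "collection" done, release the worker
    gc.stopWorker.store(true);
    worker.join();
    return worldStopped;
}
```

If the `safePoint()` call were removed from the loop, the collector in this model would spin forever waiting for the check-in, which is the stall described for threads that never allocate.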

For audio in particular, other setups may be possible where you completely separate haxe and non-haxe threads via some kind of shared queue. You could start an "audio haxe thread" from haxe which calls into native code, where it enters the gc free zone and blocks on a native "audio needed" event. The haxe thread then exits the gc free zone and uses custom code to fill a buffer (a haxe buffer or a native mapped buffer, depending on how you are set up), then posts it to the native audio thread, and the procedure continues by waiting on the event again. The native audio thread does the mixing and sets the "audio needed" event when one buffer is empty. You could poll the "audio needed" flag each frame if you are sure the frame rate is high enough and you need "main thread events only" like flash.
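A minimal sketch of such a handoff, using only the C++ standard library, might look as follows. `AudioHandoff`, `waitAndPost`, `take` and `runHandoff` are hypothetical names, the single-slot design is an assumption made for brevity, and no runtime API is involved. The wait inside `waitAndPost` marks the spot where the producer thread would sit in the gc free zone.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Single-slot handoff between a producer thread and a native audio thread.
// Everything here is a hypothetical stand-in; no runtime API is used.
class AudioHandoff {
public:
    // Producer side: block until audio is needed, then post a filled buffer.
    void waitAndPost(std::vector<float> buffer) {
        std::unique_lock<std::mutex> lock(mutex_);
        changed_.wait(lock, [this] { return audioNeeded_; });
        slot_ = std::move(buffer);
        audioNeeded_ = false;
        changed_.notify_all();
    }

    // Consumer side: take the posted buffer and raise "audio needed" again.
    std::vector<float> take() {
        std::unique_lock<std::mutex> lock(mutex_);
        changed_.wait(lock, [this] { return !audioNeeded_; });
        std::vector<float> out = std::move(slot_);
        slot_.clear();
        audioNeeded_ = true;
        changed_.notify_all();
        return out;
    }

private:
    std::mutex mutex_;
    std::condition_variable changed_;
    std::vector<float> slot_;
    bool audioNeeded_ = true; // the slot starts out empty
};

// Pushes `count` buffers of `frames` samples through the handoff and returns
// how many samples the consumer received in total.
std::size_t runHandoff(int count, std::size_t frames) {
    AudioHandoff handoff;
    std::thread producer([&handoff, count, frames] {
        for (int i = 0; i < count; ++i)
            handoff.waitAndPost(std::vector<float>(frames, 0.0f));
    });
    std::size_t received = 0;
    for (int i = 0; i < count; ++i) received += handoff.take().size();
    producer.join();
    return received;
}
```

The consumer blocks on a mutex here to keep the sketch short. A real-time audio callback would usually avoid blocking and swap pre-allocated buffers instead.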

If you are sure you are not going to make any GC calls, then you might be able to get away with not attaching - but I have not tried this.

hughsando, Jul 24 '19 08:07

I'll need to keep it the way it is to minimize audio latency and possible stutter, and I can't possibly be sure about the framerate. I call into the mixer (which is Haxe code) directly from the system's audio thread. My own mixer doesn't do any allocations after the first few calls, but the mixer can be changed by the user. I'll try detaching the thread when it stops allocating, but sadly that's not the right thing to do for every situation, so I will add gc-free zones for the general case.

RobDangerous, Jul 24 '19 11:07

Hey, I just merged and saw that you also fixed this. I also got my audio threads in order - audio callback can now notify the system when it stops doing allocations and that detaches the thread. Glitch free audio in all situations while still allowing custom audio-mixing. I think we're done here.

RobDangerous, Aug 14 '19 21:08