hxcpp

(Generational?) GC crash in processMarkStack

Open RobDangerous opened this issue 5 years ago • 7 comments

This is a nasty one: the game crashes in processMarkStack, in the obj->__Mark(this) call, when it tries to access inPtr in MarkAlloc. It's super rare, though, and needs a few hours of play on average to show up. If I remember correctly I also saw it in a debug build, but even verifying just that is hard. I'll keep looking, but so far I have no clever ideas for how to narrow this down.

RobDangerous • Nov 13 '19 13:11

@RblSb just told me that he gets a few crash reports like that in his Android app, which he released, I think, three months ago ("18 crashes for 13 users for 90 days"). Interestingly, that app does not yet use generational garbage collection. All of his more recent reports are happening in one of the marker threads. In an earlier version he also had some happening on the main thread. Now I'm not sure about anything anymore, of course. Maybe I broke something in Kha a long while ago and never noticed because it's so rare, but do you maybe have any idea what's going on?

RobDangerous • Nov 14 '19 00:11

And now he found something similar-looking in the openfl forums: https://community.openfl.org/t/markobjectalloc-crashes-windows-target/10822/7 Will try to dig up what that was exactly.

RobDangerous • Nov 14 '19 00:11

Android always makes me think of threading issues: interaction of the UI thread and the rendering thread. Perhaps accessing a common hash without a mutex or, in openfl-style code, changing the display list from a non-render thread. I have also seen (and fixed) crashes when forced allocations happen too quickly while some background collecting is happening, and some issues with allocating the marking work queues. If you disable the multi-threading in the collector (MAX_MARK_THREADS = 1) and still get a crash, that would rule a few things out. If it does still crash, you will get much better call-stack information with a single marker thread. It could even show an offending object that might need a mutex.

hughsando • Nov 14 '19 12:11

Thanks for the advice, I will now test with only one marker thread for a while. Do I understand you correctly that just accessing some Haxe object from two threads without a mutex can cause that behaviour?

RobDangerous • Nov 14 '19 13:11

Yes, hash, Anon and array objects in particular. Arrays should be ok if you don't change the size.

hughsando • Nov 15 '19 07:11
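
For illustration only, the general pattern being discussed (guarding a hash that two threads touch with a mutex) can be sketched in plain C++. This is not hxcpp code and not taken from the thread: `SharedTable` and `fill_from_two_threads` are hypothetical names, and `std::unordered_map` merely stands in for a Haxe hash. In Haxe code the analogous tool would be a mutex type such as `sys.thread.Mutex`.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <unordered_map>

// Hypothetical stand-in for a hash shared between two threads
// (for example a main thread and an audio thread).
class SharedTable {
public:
    // Every access takes the same mutex, so an insert that rehashes the
    // table can never overlap with another thread's read or write.
    void set(const std::string& key, int value) {
        std::lock_guard<std::mutex> lock(mutex_);
        map_[key] = value;
    }

    // Returns true and stores the value in `out` if the key exists.
    bool get(const std::string& key, int& out) const {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = map_.find(key);
        if (it == map_.end()) return false;
        out = it->second;
        return true;
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return map_.size();
    }

private:
    mutable std::mutex mutex_;  // mutable so const readers can lock it
    std::unordered_map<std::string, int> map_;
};

// Two threads insert disjoint keys concurrently; returns the final size.
std::size_t fill_from_two_threads(SharedTable& table, int per_thread) {
    assert(per_thread >= 0);
    auto writer = [&table, per_thread](const std::string& prefix) {
        for (int i = 0; i < per_thread; ++i) {
            table.set(prefix + std::to_string(i), i);
        }
    };
    std::thread a(writer, std::string("a"));
    std::thread b(writer, std::string("b"));
    a.join();
    b.join();
    return table.size();
}
```

Without the lock, concurrent inserts into the same map are a data race and undefined behaviour in C++, which may well surface only rarely and far from the offending code.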

OK, might be my audio thread then. I'll report back.

RobDangerous • Nov 17 '19 14:11

Fixing parallel accesses in the audio thread didn't help. But the problem doesn't seem to show up when I set MAX_MARK_THREADS to 1.

RobDangerous • Nov 28 '19 11:11