JoltPhysics
Crash on Mac in `JobSolveVelocityConstraints` / `WarmStartVelocityConstraints`
I tested my game on an M1 Mac Mini for the first time in a while, and I sadly noticed that it crashes within a few seconds of gameplay starting.
I get crash callstacks like the following:
Thread 25 Crashed:: TNative_0
0 Thrive 0x10355c800 void JPH::ContactConstraintManager::WarmStartVelocityConstraints<JPH::CalculateSolverSteps>(unsigned int const*, unsigned int const*, float, JPH::CalculateSolverSteps&) + 144
1 libthrive_native_without_avx.dylib 0x1293d2074 JPH::PhysicsSystem::JobSolveVelocityConstraints(JPH::PhysicsUpdateContext*, JPH::PhysicsUpdateContext::Step*) + 1216 (PhysicsSystem.cpp:1417)
2 libthrive_native_without_avx.dylib 0x1293f3b50 JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12::operator()() const + 44 (PhysicsSystem.cpp:436)
3 libthrive_native_without_avx.dylib 0x1293f3b18 decltype(std::declval<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12&>()()) std::__1::__invoke[abi:nn180100]<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12&>(JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12&) + 24 (invoke.h:344)
4 libthrive_native_without_avx.dylib 0x1293f3ad0 void std::__1::__invoke_void_return_wrapper<void, true>::__call[abi:nn180100]<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12&>(JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12&) + 24 (invoke.h:419)
5 libthrive_native_without_avx.dylib 0x1293f3aac std::__1::__function::__alloc_func<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12, std::__1::allocator<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12>, void ()>::operator()[abi:nn180100]() + 28 (function.h:169)
6 libthrive_native_without_avx.dylib 0x1293f2984 std::__1::__function::__func<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12, std::__1::allocator<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12>, void ()>::operator()() + 28 (function.h:311)
7 libthrive_native_without_avx.dylib 0x128f95bd8 std::__1::__function::__value_func<void ()>::operator()[abi:nn180100]() const + 68 (function.h:428)
8 libthrive_native_without_avx.dylib 0x128f954cc std::__1::function<void ()>::operator()() const + 24 (function.h:981)
9 libthrive_native_without_avx.dylib 0x128f93b44 JPH::JobSystem::Job::Execute() + 140 (JobSystem.h:245)
# Another:
Crashed Thread: 23 TNative_0
Exception Type: EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x00300012005e8614 -> 0x00000012005e8614 (possible pointer authentication failure)
Exception Codes: 0x0000000000000001, 0x00300012005e8614
Thread 23 Crashed:: TNative_0
0 Thrive 0x10324c800 void JPH::ContactConstraintManager::WarmStartVelocityConstraints<JPH::CalculateSolverSteps>(unsigned int const*, unsigned int const*, float, JPH::CalculateSolverSteps&) + 144
1 libthrive_native_without_avx.dylib 0x1367c6074 JPH::PhysicsSystem::JobSolveVelocityConstraints(JPH::PhysicsUpdateContext*, JPH::PhysicsUpdateContext::Step*) + 1216 (PhysicsSystem.cpp:1417)
2 libthrive_native_without_avx.dylib 0x1367e7b50 JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12::operator()() const + 44 (PhysicsSystem.cpp:436)
3 libthrive_native_without_avx.dylib 0x1367e7b18 decltype(std::declval<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12&>()()) std::__1::__invoke[abi:nn180100]<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12&>(JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12&) + 24 (invoke.h:344)
4 libthrive_native_without_avx.dylib 0x1367e7ad0 void std::__1::__invoke_void_return_wrapper<void, true>::__call[abi:nn180100]<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12&>(JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12&) + 24 (invoke.h:419)
5 libthrive_native_without_avx.dylib 0x1367e7aac std::__1::__function::__alloc_func<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12, std::__1::allocator<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12>, void ()>::operator()[abi:nn180100]() + 28 (function.h:169)
6 libthrive_native_without_avx.dylib 0x1367e6984 std::__1::__function::__func<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12, std::__1::allocator<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12>, void ()>::operator()() + 28 (function.h:311)
7 libthrive_native_without_avx.dylib 0x136389bd8 std::__1::__function::__value_func<void ()>::operator()[abi:nn180100]() const + 68 (function.h:428)
8 libthrive_native_without_avx.dylib 0x1363894cc std::__1::function<void ()>::operator()() const + 24 (function.h:981)
9 libthrive_native_without_avx.dylib 0x136387b44 JPH::JobSystem::Job::Execute() + 140 (JobSystem.h:245)
10 libthrive_native_without_avx.dylib 0x1368c49c8 Thrive::TaskSystem::QueuedTask::Invoke() const + 136
11 libthrive_native_without_avx.dylib 0x1368c647c Thrive::TaskSystem::RunTaskThread(int) + 280
# And this is the other location with no WarmStartVelocity in the stack:
Thread 33 Crashed:: TNative_1
0 libsystem_kernel.dylib 0x19a5055d0 __pthread_kill + 8
1 libsystem_pthread.dylib 0x19a53dc20 pthread_kill + 288
2 libsystem_c.dylib 0x19a44aac4 __abort + 136
3 libsystem_c.dylib 0x19a44aa3c abort + 192
4 Godot 0x104cd19d8 0x104508000 + 8165848
5 libcoreclr.dylib 0x12d432998 invoke_previous_action(sigaction*, int, __siginfo*, void*, bool) + 60
6 libsystem_platform.dylib 0x19a56e584 _sigtramp + 56
7 libthrive_native_without_avx.dylib 0x1493a6074 JPH::PhysicsSystem::JobSolveVelocityConstraints(JPH::PhysicsUpdateContext*, JPH::PhysicsUpdateContext::Step*) + 1216 (PhysicsSystem.cpp:1417)
8 libthrive_native_without_avx.dylib 0x1493c7b50 JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12::operator()() const + 44 (PhysicsSystem.cpp:436)
9 libthrive_native_without_avx.dylib 0x1493c7b18 decltype(std::declval<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12&>()()) std::__1::__invoke[abi:nn180100]<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12&>(JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12&) + 24 (invoke.h:344)
This leads me to think that there's a problem in Jolt. Interestingly, this only happens on Mac; everything is fine on both Linux and Windows. I don't know when this problem started. I noticed it yesterday when using a Jolt binary built in December, which for some reason worked back then but now consistently causes my game builds to fail on both of my Macs (M1 and Intel).
I just tried updating to the latest master branch commit of Jolt, but that didn't seem to change anything (I didn't fully verify that the new version was actually picked up, so I can do that if this problem should already be fixed).
I can get more info on this if I know what I should be looking for. I can easily catch this crash in a debugger to get more info.
In case I might be doing something wrong here's my code that interacts with Jolt: https://github.com/Revolutionary-Games/Thrive/tree/master/src/native/physics
I just synced to the master version of Jolt on my Mac M4 and compiled the Samples and UnitTests. They run without any issues.
The error:
KERN_INVALID_ADDRESS at 0x00300012005e8614 -> 0x00000012005e8614 (possible pointer authentication failure)
mentions a possible pointer authentication failure, but that feature is only available on ARM, and since you're also crashing on Intel, this must mean that this is memory corruption of some sort.
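To make the two addresses in that message concrete: the second value is the first with its upper bits cleared. Below is a minimal sketch of that masking, assuming purely for illustration that the low 47 bits carry the virtual address. The helper name and the bit width are hypothetical and not taken from Jolt or from any macOS header.

```cpp
#include <cstdint>

// Hypothetical helper: keep only the low `addressBits` bits of a pointer value.
// The 47-bit default is an assumption for illustration, not a documented constant.
constexpr std::uint64_t StripUpperBits(std::uint64_t value, unsigned addressBits = 47)
{
    return value & ((std::uint64_t{1} << addressBits) - 1);
}

// The two values printed in the crash report above.
static_assert(StripUpperBits(0x00300012005e8614ULL) == 0x00000012005e8614ULL,
              "clearing the upper bits yields the second address from the report");
```

The set upper bits only show that the pointer value itself was not a plain user-space address. On their own they do not say whether a signature check or ordinary memory corruption produced it.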
It says it's crashing in WarmStartVelocity on line 144:
https://github.com/jrouwe/JoltPhysics/blob/fd37495ad743949d0f68956064a8eda0de1d99d0/Jolt/Physics/Constraints/ContactConstraintManager.cpp#L144
but there's no code there so maybe your version is different than the latest version. It would be good to know which line you're actually crashing on and what variable is being accessed there.
In any case, the memory that is most likely written here is memory allocated by the TempAllocator, but I see that you're using one of the standard implementations of this, so it should be thread safe.
I also see code that suggests that the physics update is done on a separate thread. Maybe you have some sort of threading issue? You could check whether the problem persists when you run it on the main thread and ensure that no other code of your application is running in parallel with Jolt's code.
Disabling multithreading does not seem to help. Now the crash happens on the main thread like this (just as quickly as before):
(I'll interject quickly here: even when I configure my physics to block the main thread, it still uses one extra worker thread. At least I think that is why the crash happened to land on a different thread here. Doing the same run again, I can also get the crash to trigger on the main thread: `* thread #1, name = 'TMain', queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x78)`.)
Thread 0 Crashed:: TMain Dispatch queue: com.apple.main-thread
0 libsystem_kernel.dylib 0x19a9295d0 __pthread_kill + 8
1 libsystem_pthread.dylib 0x19a961c20 pthread_kill + 288
2 libsystem_c.dylib 0x19a86eac4 __abort + 136
3 libsystem_c.dylib 0x19a86ea3c abort + 192
4 Godot 0x1056099d8 0x104e40000 + 8165848
5 libcoreclr.dylib 0x117732998 invoke_previous_action(sigaction*, int, __siginfo*, void*, bool) + 60
6 libsystem_platform.dylib 0x19a992584 _sigtramp + 56
7 libthrive_native_without_avx.dylib 0x1319700c0 JPH::PhysicsSystem::JobSolveVelocityConstraints(JPH::PhysicsUpdateContext*, JPH::PhysicsUpdateContext::Step*) + 1216 (PhysicsSystem.cpp:1432)
8 libthrive_native_without_avx.dylib 0x131991d14 JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12::operator()() const + 44 (PhysicsSystem.cpp:439)
9 libthrive_native_without_avx.dylib 0x131991cdc decltype(std::declval<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12&>()()) std::__1::__invoke[abi:nn180100]<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12&>(JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12&) + 24 (invoke.h:344)
10 libthrive_native_without_avx.dylib 0x131991c94 void std::__1::__invoke_void_return_wrapper<void, true>::__call[abi:nn180100]<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12&>(JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12&) + 24 (invoke.h:419)
11 libthrive_native_without_avx.dylib 0x131991c70 std::__1::__function::__alloc_func<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12, std::__1::allocator<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12>, void ()>::operator()[abi:nn180100]() + 28 (function.h:169)
12 libthrive_native_without_avx.dylib 0x131990b48 std::__1::__function::__func<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12, std::__1::allocator<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12>, void ()>::operator()() + 28 (function.h:311)
13 libthrive_native_without_avx.dylib 0x13152cdec std::__1::__function::__value_func<void ()>::operator()[abi:nn180100]() const + 68 (function.h:428)
14 libthrive_native_without_avx.dylib 0x13152c6e0 std::__1::function<void ()>::operator()() const + 24 (function.h:981)
15 libthrive_native_without_avx.dylib 0x13152ad5c JPH::JobSystem::Job::Execute() + 140 (JobSystem.h:245)
16 libthrive_native_without_avx.dylib 0x13152a9dc JPH::JobSystemWithBarrier::BarrierImpl::Wait() + 460 (JobSystemWithBarrier.cpp:140)
17 libthrive_native_without_avx.dylib 0x13152b5f8 JPH::JobSystemWithBarrier::WaitForJobs(JPH::JobSystem::Barrier*) + 68 (JobSystemWithBarrier.cpp:227)
18 libthrive_native_without_avx.dylib 0x131967c70 JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*) + 8604 (PhysicsSystem.cpp:573)
19 libthrive_native_without_avx.dylib 0x131a980b0 Thrive::Physics::PhysicalWorld::StepPhysics(float) + 188
20 libthrive_native_without_avx.dylib 0x131a97fa4 Thrive::Physics::PhysicalWorld::Process(float) + 196
21 libthrive_native_without_avx.dylib 0x131a780e0 ProcessPhysicalWorld + 32
22 ??? 0x310adc8d8 ???
23 ??? 0x310adc740 ???
24 ??? 0x310adc688 ???
25 ??? 0x310adc600 ???
26 ??? 0x310ab1e9c ???
27 ??? 0x310ab1944 ???
28 ??? 0x310a8d644 ???
29 ??? 0x30c2848bc ???
30 ??? 0x30e9be294 ???
31 ??? 0x31045c698 ???
32 ??? 0x31045afd8 ???
33 ??? 0x3104592cc ???
34 ??? 0x30c2839c8 ???
35 Godot 0x107b21dec Node::_notification(int) + 1708
36 Godot 0x109c6a198 Object::notification(int, bool) + 80
37 Godot 0x107b6cde4 SceneTree::_process_group(SceneTree::ProcessGroup*, bool) + 432
38 Godot 0x107b6add8 SceneTree::_process(bool) + 856
39 Godot 0x107b6b5d4 SceneTree::process(double) + 228
40 Godot 0x10567b9c4 Main::iteration() + 1032
41 Godot 0x1056041f0 OS_MacOS::run() + 168
42 Godot 0x105635160 main + 388
43 dyld 0x19a5d7154 start + 2476
Here's what I get with lldb:
Process 2278 stopped
* thread #28, name = 'TNative_0', stop reason = EXC_BAD_ACCESS (code=1, address=0x46700000000)
frame #0: 0x000000010130dd08 godot`void JPH::ContactConstraintManager::WarmStartVelocityConstraints<JPH::CalculateSolverSteps>(unsigned int const*, unsigned int const*, float, JPH::CalculateSolverSteps&) + 140
godot`JPH::ContactConstraintManager::WarmStartVelocityConstraints<JPH::CalculateSolverSteps>:
-> 0x10130dd08 <+140>: ldp x10, x14, [x11]
0x10130dd0c <+144>: ldrb w15, [x10, #0x78]
0x10130dd10 <+148>: ldr x11, [x14, #0x48]
0x10130dd14 <+152>: cmp w15, #0x2
Target 0: (godot) stopped.
(lldb) list
(lldb) bt
* thread #28, name = 'TNative_0', stop reason = EXC_BAD_ACCESS (code=1, address=0x46700000000)
* frame #0: 0x000000010130dd08 godot`void JPH::ContactConstraintManager::WarmStartVelocityConstraints<JPH::CalculateSolverSteps>(unsigned int const*, unsigned int const*, float, JPH::CalculateSolverSteps&) + 140
frame #1: 0x0000000128ef40c0 libthrive_native_without_avx.dylib`JPH::PhysicsSystem::JobSolveVelocityConstraints(this=0x00000003cda25e00, ioContext=0x000000016fdfcb08, ioStep=0x000000013ad00000) at PhysicsSystem.cpp:1432:20
frame #2: 0x0000000128f15d14 libthrive_native_without_avx.dylib`JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12::operator()(this=0x0000000145198c80) const at PhysicsSystem.cpp:439:31
frame #3: 0x0000000128f15cdc libthrive_native_without_avx.dylib`decltype(std::declval<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12&>()()) std::__1::__invoke[abi:nn180100]<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12&>(__f=0x0000000145198c80) at invoke.h:344:25
frame #4: 0x0000000128f15c94 libthrive_native_without_avx.dylib`void std::__1::__invoke_void_return_wrapper<void, true>::__call[abi:nn180100]<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12&>(__args=0x0000000145198c80) at invoke.h:419:5
frame #5: 0x0000000128f15c70 libthrive_native_without_avx.dylib`std::__1::__function::__alloc_func<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12, std::__1::allocator<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12>, void ()>::operator()[abi:nn180100](this=0x0000000145198c80) at function.h:169:12
frame #6: 0x0000000128f14b48 libthrive_native_without_avx.dylib`std::__1::__function::__func<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12, std::__1::allocator<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12>, void ()>::operator()(this=0x0000000145198c78) at function.h:311:10
frame #7: 0x0000000128ab0dec libthrive_native_without_avx.dylib`std::__1::__function::__value_func<void ()>::operator()[abi:nn180100](this=0x0000000145198c78) const at function.h:428:12
frame #8: 0x0000000128ab06e0 libthrive_native_without_avx.dylib`std::__1::function<void ()>::operator()(this=0x0000000145198c78) const at function.h:981:10
frame #9: 0x0000000128aaed5c libthrive_native_without_avx.dylib`JPH::JobSystem::Job::Execute(this=0x0000000145198c58) at JobSystem.h:245:5
frame #10: 0x00000001290078e0 libthrive_native_without_avx.dylib`Thrive::TaskSystem::QueuedTask::Invoke() const + 136
frame #11: 0x0000000129009394 libthrive_native_without_avx.dylib`Thrive::TaskSystem::RunTaskThread(int) + 280
frame #12: 0x0000000129013c00 libthrive_native_without_avx.dylib`decltype(*std::declval<Thrive::TaskSystem*>().*std::declval<void (Thrive::TaskSystem::*)(int)>()(std::declval<int>())) std::__1::__invoke[abi:ne180100]<void (Thrive::TaskSystem::*)(int), Thrive::TaskSystem*, int, void>(void (Thrive::TaskSystem::*&&)(int), Thrive::TaskSystem*&&, int&&) + 88
frame #13: 0x0000000129013b58 libthrive_native_without_avx.dylib`void std::__1::__thread_execute[abi:ne180100]<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, void (Thrive::TaskSystem::*)(int), Thrive::TaskSystem*, int, 2ul, 3ul>(std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, void (Thrive::TaskSystem::*)(int), Thrive::TaskSystem*, int>&, std::__1::__tuple_indices<2ul, 3ul>) + 68
frame #14: 0x00000001290133b8 libthrive_native_without_avx.dylib`void* std::__1::__thread_proxy[abi:ne180100]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, void (Thrive::TaskSystem::*)(int), Thrive::TaskSystem*, int>>(void*) + 88
frame #15: 0x000000019a961f94 libsystem_pthread.dylib`_pthread_start + 136
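In the disassembly that follows, `mov w8, #0x360` loads the constant 864, and `umaddl x11, w13, w8, x12` forms the address that the faulting `ldp x10, x14, [x11]` reads from: `x11 = x12 + w13 * 864`. Here `w13` is a 32-bit value read from `[x1]` and `x12` is a pointer read from `[x0, #0xb0]`. Note that `<+140>` is a byte offset from the start of the function, not a source line. A minimal sketch of that arithmetic follows, with hypothetical names, since the output does not show which Jolt members the pointer and index correspond to:

```cpp
#include <cstdint>

// Hypothetical names; only the constant 0x360 (864) and the shape of the
// computation are taken from the disassembly.
constexpr std::uint32_t kStride = 0x360;

// Mirrors `umaddl x11, w13, w8, x12`: unsigned 32x32 -> 64-bit multiply, then add.
constexpr std::uint64_t ElementAddress(std::uint64_t base, std::uint32_t index)
{
    return base + static_cast<std::uint64_t>(index) * kStride;
}

// Inverse check: does `address` land exactly on an element boundary for `base`?
constexpr bool IsElementAddress(std::uint64_t base, std::uint64_t address)
{
    return address >= base && (address - base) % kStride == 0;
}

static_assert(ElementAddress(0x1000ULL, 2) == 0x1000ULL + 2ULL * 864ULL,
              "index 2 sits two strides past the base");
static_assert(IsElementAddress(0x1000ULL, 0x1000ULL + 864ULL),
              "one stride past the base is an element start");
static_assert(!IsElementAddress(0x1000ULL, 0x1001ULL),
              "an unaligned address is not an element start");
```

Reading `x12` and `x13` in a crashing session (for example with `register read x12 x13` in lldb) and plugging them into `ElementAddress` would show whether the pointer or the index is the bad value.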
(lldb) dis
godot`JPH::ContactConstraintManager::WarmStartVelocityConstraints<JPH::CalculateSolverSteps>:
0x10130dc7c <+0>: cmp x1, x2
0x10130dc80 <+4>: b.hs 0x10130e4ec ; <+2160>
0x10130dc84 <+8>: mov w8, #0x360 ; =864
0x10130dc88 <+12>: mov w9, #0xc8 ; =200
0x10130dc8c <+16>: mov x10, #0x1 ; =1
0x10130dc90 <+20>: movk x10, #0x2, lsl #32
0x10130dc94 <+24>: dup.2d v1, x10
0x10130dc98 <+28>: adrp x10, 17902
0x10130dc9c <+32>: ldr q2, [x10, #0x6f0]
0x10130dca0 <+36>: b 0x10130dcfc ; <+128>
0x10130dca4 <+40>: mov x10, x11
0x10130dca8 <+44>: ldrb w11, [x10, #0x7b]
0x10130dcac <+48>: cmp w11, #0x0
0x10130dcb0 <+52>: cset w12, eq
0x10130dcb4 <+56>: ldp w13, w14, [x3, #0x8]
0x10130dcb8 <+60>: cmp w13, w11
0x10130dcbc <+64>: csel w11, w13, w11, hi
0x10130dcc0 <+68>: ldrb w13, [x3, #0x10]
0x10130dcc4 <+72>: orr w12, w13, w12
0x10130dcc8 <+76>: strb w12, [x3, #0x10]
0x10130dccc <+80>: ldrb w10, [x10, #0x7c]
0x10130dcd0 <+84>: cmp w10, #0x0
0x10130dcd4 <+88>: cset w12, eq
0x10130dcd8 <+92>: cmp w14, w10
0x10130dcdc <+96>: csel w10, w14, w10, hi
0x10130dce0 <+100>: stp w11, w10, [x3, #0x8]
0x10130dce4 <+104>: ldrb w10, [x3, #0x11]
0x10130dce8 <+108>: orr w10, w10, w12
0x10130dcec <+112>: strb w10, [x3, #0x11]
0x10130dcf0 <+116>: add x1, x1, #0x4
0x10130dcf4 <+120>: cmp x1, x2
0x10130dcf8 <+124>: b.hs 0x10130e4ec ; <+2160>
0x10130dcfc <+128>: ldr x12, [x0, #0xb0]
0x10130dd00 <+132>: ldr w13, [x1]
0x10130dd04 <+136>: umaddl x11, w13, w8, x12
-> 0x10130dd08 <+140>: ldp x10, x14, [x11]
0x10130dd0c <+144>: ldrb w15, [x10, #0x78]
0x10130dd10 <+148>: ldr x11, [x14, #0x48]
0x10130dd14 <+152>: cmp w15, #0x2
0x10130dd18 <+156>: b.ne 0x10130e06c ; <+1008>
0x10130dd1c <+160>: ldr x10, [x10, #0x48]
0x10130dd20 <+164>: ldrb w14, [x14, #0x78]
0x10130dd24 <+168>: umaddl x15, w13, w8, x12
0x10130dd28 <+172>: ldur q3, [x15, #0x18]
0x10130dd2c <+176>: fabs s7, s3
0x10130dd30 <+180>: mov s6, v3[1]
0x10130dd34 <+184>: fabs s16, s6
0x10130dd38 <+188>: mov s4, v3[2]
0x10130dd3c <+192>: fmul.s s5, s4, v3[2]
0x10130dd40 <+196>: cmp w14, #0x2
0x10130dd44 <+200>: b.ne 0x10130e294 ; <+1560>
0x10130dd48 <+204>: fcmp s7, s16
0x10130dd4c <+208>: fmul s7, s6, s6
0x10130dd50 <+212>: fadd s7, s7, s5
0x10130dd54 <+216>: fsqrt s7, s7
0x10130dd58 <+220>: fneg s6, s6
0x10130dd5c <+224>: movi d16, #0000000000000000
0x10130dd60 <+228>: mov.s v16[1], v4[0]
0x10130dd64 <+232>: fmul.4s v17, v3, v3
0x10130dd68 <+236>: fadd s5, s17, s5
0x10130dd6c <+240>: fsqrt s5, s5
0x10130dd70 <+244>: movi d17, #0000000000000000
0x10130dd74 <+248>: mov.s v17[0], v4[0]
0x10130dd78 <+252>: fneg s18, s3
0x10130dd7c <+256>: fcsel d4, d16, d17, le
0x10130dd80 <+260>: fcsel s6, s6, s18, le
0x10130dd84 <+264>: fcsel s5, s7, s5, le
0x10130dd88 <+268>: umaddl x14, w13, w8, x12
0x10130dd8c <+272>: ldr w14, [x14, #0x38]
0x10130dd90 <+276>: cbz w14, 0x10130e4a0 ; <+2084>
0x10130dd94 <+280>: dup.2s v6, v6[0]
0x10130dd98 <+284>: mov.d v4[1], v6[0]
0x10130dd9c <+288>: dup.4s v5, v5[0]
0x10130dda0 <+292>: fdiv.4s v4, v4, v5
0x10130dda4 <+296>: ext.16b v5, v4, v4, #0xc
0x10130dda8 <+300>: ext.16b v5, v5, v4, #0x8
0x10130ddac <+304>: fmul.4s v5, v3, v5
0x10130ddb0 <+308>: ext.16b v6, v3, v3, #0xc
0x10130ddb4 <+312>: ext.16b v6, v6, v3, #0x8
0x10130ddb8 <+316>: fmul.4s v6, v6, v4
0x10130ddbc <+320>: fsub.4s v6, v5, v6
0x10130ddc0 <+324>: ext.16b v5, v6, v6, #0x4
0x10130ddc4 <+328>: mov.s v5[2], v6[0]
0x10130ddc8 <+332>: umaddl x15, w13, w8, x12
0x10130ddcc <+336>: add x12, x15, #0x28
0x10130ddd0 <+340>: add x13, x15, #0x30
0x10130ddd4 <+344>: umull x14, w14, w9
0x10130ddd8 <+348>: add x15, x15, #0xfc
0x10130dddc <+352>: b 0x10130ddec ; <+368>
0x10130dde0 <+356>: add x15, x15, #0xc8
0x10130dde4 <+360>: subs x14, x14, #0xc8
0x10130dde8 <+364>: b.eq 0x10130e4a0 ; <+2084>
0x10130ddec <+368>: ldur s6, [x15, #-0x4c]
0x10130ddf0 <+372>: fcmp s6, #0.0
0x10130ddf4 <+376>: b.ne 0x10130de04 ; <+392>
0x10130ddf8 <+380>: ldur s6, [x15, #-0xc]
0x10130ddfc <+384>: fcmp s6, #0.0
0x10130de00 <+388>: b.eq 0x10130df9c ; <+800>
0x10130de04 <+392>: ldr s16, [x12]
0x10130de08 <+396>: ldr s6, [x13]
0x10130de0c <+400>: ldur s7, [x15, #-0x40]
0x10130de10 <+404>: fmul s7, s7, s0
0x10130de14 <+408>: stur s7, [x15, #-0x40]
0x10130de18 <+412>: fcmp s7, #0.0
0x10130de1c <+416>: b.eq 0x10130ded8 ; <+604>
0x10130de20 <+420>: fmul s16, s16, s7
0x10130de24 <+424>: fmul.4s v16, v4, v16[0]
0x10130de28 <+428>: ldp q17, q18, [x10]
0x10130de2c <+432>: fsub.4s v16, v17, v16
0x10130de30 <+436>: mov.d x16, v16[1]
0x10130de34 <+440>: ldrb w17, [x10, #0x7a]
0x10130de38 <+444>: dup.4s v17, w17
0x10130de3c <+448>: mov.d x17, v17[1]
0x10130de40 <+452>: fmov x4, d16
0x10130de44 <+456>: and x17, x17, #0x4
0x10130de48 <+460>: and.16b v16, v17, v1
0x10130de4c <+464>: mov.d v16[1], x17
0x10130de50 <+468>: cmeq.4s v16, v16, v2
0x10130de54 <+472>: mov.d x17, v16[1]
0x10130de58 <+476>: fmov x5, d16
0x10130de5c <+480>: and x4, x5, x4
0x10130de60 <+484>: and x16, x17, x16
0x10130de64 <+488>: stp x4, x16, [x10]
0x10130de68 <+492>: ldur q16, [x15, #-0x64]
0x10130de6c <+496>: fmul.4s v16, v16, v7[0]
0x10130de70 <+500>: fsub.4s v16, v18, v16
0x10130de74 <+504>: str q16, [x10, #0x10]
0x10130de78 <+508>: fmul s6, s6, s7
0x10130de7c <+512>: fmul.4s v6, v4, v6[0]
0x10130de80 <+516>: ldp q16, q17, [x11]
0x10130de84 <+520>: fadd.4s v6, v6, v16
0x10130de88 <+524>: mov.d x16, v6[1]
0x10130de8c <+528>: ldrb w17, [x11, #0x7a]
0x10130de90 <+532>: dup.4s v16, w17
0x10130de94 <+536>: mov.d x17, v16[1]
0x10130de98 <+540>: and x17, x17, #0x4
0x10130de9c <+544>: and.16b v16, v16, v1
0x10130dea0 <+548>: mov.d v16[1], x17
0x10130dea4 <+552>: cmeq.4s v16, v16, v2
0x10130dea8 <+556>: mov.d x17, v16[1]
0x10130deac <+560>: fmov x4, d6
0x10130deb0 <+564>: fmov x5, d16
0x10130deb4 <+568>: and x4, x5, x4
0x10130deb8 <+572>: and x16, x17, x16
0x10130debc <+576>: stp x4, x16, [x11]
0x10130dec0 <+580>: ldur q6, [x15, #-0x58]
0x10130dec4 <+584>: fmul.4s v6, v6, v7[0]
0x10130dec8 <+588>: fadd.4s v6, v17, v6
0x10130decc <+592>: str q6, [x11, #0x10]
0x10130ded0 <+596>: ldr s16, [x12]
0x10130ded4 <+600>: ldr s6, [x13]
0x10130ded8 <+604>: ldr s7, [x15]
0x10130dedc <+608>: fmul s7, s7, s0
0x10130dee0 <+612>: str s7, [x15]
0x10130dee4 <+616>: fcmp s7, #0.0
0x10130dee8 <+620>: b.eq 0x10130df9c ; <+800>
0x10130deec <+624>: fmul s16, s16, s7
0x10130def0 <+628>: fmul.4s v16, v5, v16[0]
0x10130def4 <+632>: ldp q17, q18, [x10]
0x10130def8 <+636>: fsub.4s v16, v17, v16
0x10130defc <+640>: mov.d x16, v16[1]
0x10130df00 <+644>: ldrb w17, [x10, #0x7a]
0x10130df04 <+648>: dup.4s v17, w17
0x10130df08 <+652>: mov.d x17, v17[1]
0x10130df0c <+656>: and x17, x17, #0x4
0x10130df10 <+660>: and.16b v17, v17, v1
0x10130df14 <+664>: mov.d v17[1], x17
0x10130df18 <+668>: fmov x17, d16
0x10130df1c <+672>: cmeq.4s v16, v17, v2
0x10130df20 <+676>: mov.d x4, v16[1]
0x10130df24 <+680>: fmov x5, d16
0x10130df28 <+684>: and x17, x5, x17
0x10130df2c <+688>: and x16, x4, x16
0x10130df30 <+692>: stp x17, x16, [x10]
0x10130df34 <+696>: ldur q16, [x15, #-0x24]
0x10130df38 <+700>: fmul.4s v16, v16, v7[0]
0x10130df3c <+704>: fsub.4s v16, v18, v16
0x10130df40 <+708>: str q16, [x10, #0x10]
0x10130df44 <+712>: fmul s6, s6, s7
0x10130df48 <+716>: fmul.4s v6, v5, v6[0]
0x10130df4c <+720>: ldp q16, q17, [x11]
0x10130df50 <+724>: fadd.4s v6, v6, v16
0x10130df54 <+728>: mov.d x16, v6[1]
0x10130df58 <+732>: fmov x17, d6
0x10130df5c <+736>: ldrb w4, [x11, #0x7a]
0x10130df60 <+740>: dup.4s v6, w4
0x10130df64 <+744>: mov.d x4, v6[1]
0x10130df68 <+748>: and x4, x4, #0x4
0x10130df6c <+752>: and.16b v6, v6, v1
0x10130df70 <+756>: mov.d v6[1], x4
0x10130df74 <+760>: cmeq.4s v6, v6, v2
0x10130df78 <+764>: mov.d x4, v6[1]
0x10130df7c <+768>: fmov x5, d6
0x10130df80 <+772>: and x17, x5, x17
0x10130df84 <+776>: and x16, x4, x16
0x10130df88 <+780>: stp x17, x16, [x11]
0x10130df8c <+784>: ldur q6, [x15, #-0x18]
0x10130df90 <+788>: fmul.4s v6, v6, v7[0]
0x10130df94 <+792>: fadd.4s v6, v17, v6
0x10130df98 <+796>: str q6, [x11, #0x10]
0x10130df9c <+800>: ldr s16, [x12]
0x10130dfa0 <+804>: ldr s7, [x13]
0x10130dfa4 <+808>: ldur s6, [x15, #-0x80]
0x10130dfa8 <+812>: fmul s6, s6, s0
0x10130dfac <+816>: stur s6, [x15, #-0x80]
0x10130dfb0 <+820>: fcmp s6, #0.0
0x10130dfb4 <+824>: b.eq 0x10130dde0 ; <+356>
0x10130dfb8 <+828>: fmul s16, s16, s6
0x10130dfbc <+832>: fmul.4s v16, v3, v16[0]
0x10130dfc0 <+836>: ldp q17, q18, [x10]
0x10130dfc4 <+840>: fsub.4s v16, v17, v16
0x10130dfc8 <+844>: mov.d x16, v16[1]
0x10130dfcc <+848>: ldrb w17, [x10, #0x7a]
0x10130dfd0 <+852>: dup.4s v17, w17
0x10130dfd4 <+856>: mov.d x17, v17[1]
0x10130dfd8 <+860>: and x17, x17, #0x4
0x10130dfdc <+864>: and.16b v17, v17, v1
0x10130dfe0 <+868>: mov.d v17[1], x17
0x10130dfe4 <+872>: fmov x17, d16
0x10130dfe8 <+876>: cmeq.4s v16, v17, v2
0x10130dfec <+880>: mov.d x4, v16[1]
0x10130dff0 <+884>: fmov x5, d16
0x10130dff4 <+888>: and x17, x5, x17
0x10130dff8 <+892>: and x16, x4, x16
0x10130dffc <+896>: stp x17, x16, [x10]
0x10130e000 <+900>: ldur q16, [x15, #-0xa4]
0x10130e004 <+904>: fmul.4s v16, v16, v6[0]
0x10130e008 <+908>: fsub.4s v16, v18, v16
0x10130e00c <+912>: str q16, [x10, #0x10]
0x10130e010 <+916>: fmul s7, s7, s6
0x10130e014 <+920>: fmul.4s v7, v3, v7[0]
0x10130e018 <+924>: ldp q16, q17, [x11]
0x10130e01c <+928>: fadd.4s v7, v7, v16
0x10130e020 <+932>: mov.d x16, v7[1]
0x10130e024 <+936>: fmov x17, d7
0x10130e028 <+940>: ldrb w4, [x11, #0x7a]
0x10130e02c <+944>: dup.4s v7, w4
0x10130e030 <+948>: mov.d x4, v7[1]
0x10130e034 <+952>: and x4, x4, #0x4
0x10130e038 <+956>: and.16b v7, v7, v1
0x10130e03c <+960>: mov.d v7[1], x4
0x10130e040 <+964>: cmeq.4s v7, v7, v2
0x10130e044 <+968>: mov.d x4, v7[1]
0x10130e048 <+972>: fmov x5, d7
0x10130e04c <+976>: and x17, x5, x17
0x10130e050 <+980>: and x16, x4, x16
0x10130e054 <+984>: stp x17, x16, [x11]
0x10130e058 <+988>: ldur q7, [x15, #-0x98]
0x10130e05c <+992>: fmul.4s v6, v7, v6[0]
0x10130e060 <+996>: fadd.4s v6, v17, v6
0x10130e064 <+1000>: str q6, [x11, #0x10]
0x10130e068 <+1004>: b 0x10130dde0 ; <+356>
0x10130e06c <+1008>: umaddl x10, w13, w8, x12
0x10130e070 <+1012>: ldur q3, [x10, #0x18]
0x10130e074 <+1016>: fabs s4, s3
0x10130e078 <+1020>: mov s5, v3[1]
0x10130e07c <+1024>: fabs s6, s5
0x10130e080 <+1028>: mov s7, v3[2]
0x10130e084 <+1032>: fmul.s s16, s7, v3[2]
0x10130e088 <+1036>: fcmp s4, s6
0x10130e08c <+1040>: fmul s4, s5, s5
0x10130e090 <+1044>: fadd s4, s4, s16
0x10130e094 <+1048>: fsqrt s17, s4
0x10130e098 <+1052>: fneg s5, s5
0x10130e09c <+1056>: movi d4, #0000000000000000
0x10130e0a0 <+1060>: mov.s v4[1], v7[0]
0x10130e0a4 <+1064>: fmul.4s v6, v3, v3
0x10130e0a8 <+1068>: fadd s6, s6, s16
0x10130e0ac <+1072>: fsqrt s16, s6
0x10130e0b0 <+1076>: movi d6, #0000000000000000
0x10130e0b4 <+1080>: mov.s v6[0], v7[0]
0x10130e0b8 <+1084>: fneg s7, s3
0x10130e0bc <+1088>: fcsel d4, d4, d6, le
0x10130e0c0 <+1092>: fcsel s6, s5, s7, le
0x10130e0c4 <+1096>: fcsel s5, s17, s16, le
0x10130e0c8 <+1100>: umaddl x10, w13, w8, x12
0x10130e0cc <+1104>: ldr w14, [x10, #0x38]
0x10130e0d0 <+1108>: cbz w14, 0x10130dca4 ; <+40>
0x10130e0d4 <+1112>: dup.2s v6, v6[0]
0x10130e0d8 <+1116>: mov.d v4[1], v6[0]
0x10130e0dc <+1120>: dup.4s v5, v5[0]
0x10130e0e0 <+1124>: fdiv.4s v4, v4, v5
0x10130e0e4 <+1128>: ext.16b v5, v4, v4, #0xc
0x10130e0e8 <+1132>: ext.16b v5, v5, v4, #0x8
0x10130e0ec <+1136>: fmul.4s v5, v3, v5
0x10130e0f0 <+1140>: ext.16b v6, v3, v3, #0xc
0x10130e0f4 <+1144>: ext.16b v6, v6, v3, #0x8
0x10130e0f8 <+1148>: fmul.4s v6, v6, v4
0x10130e0fc <+1152>: fsub.4s v6, v5, v6
0x10130e100 <+1156>: ext.16b v5, v6, v6, #0x4
0x10130e104 <+1160>: mov.s v5[2], v6[0]
0x10130e108 <+1164>: umaddl x12, w13, w8, x12
0x10130e10c <+1168>: add x10, x12, #0x30
0x10130e110 <+1172>: add x12, x12, #0xfc
0x10130e114 <+1176>: umull x13, w14, w9
0x10130e118 <+1180>: b 0x10130e128 ; <+1196>
0x10130e11c <+1184>: add x12, x12, #0xc8
0x10130e120 <+1188>: subs x13, x13, #0xc8
0x10130e124 <+1192>: b.eq 0x10130dca4 ; <+40>
0x10130e128 <+1196>: ldur s6, [x12, #-0x4c]
0x10130e12c <+1200>: fcmp s6, #0.0
0x10130e130 <+1204>: b.ne 0x10130e140 ; <+1220>
0x10130e134 <+1208>: ldur s6, [x12, #-0xc]
0x10130e138 <+1212>: fcmp s6, #0.0
0x10130e13c <+1216>: b.eq 0x10130e220 ; <+1444>
0x10130e140 <+1220>: ldr s7, [x10]
0x10130e144 <+1224>: ldur s6, [x12, #-0x40]
0x10130e148 <+1228>: fmul s6, s6, s0
0x10130e14c <+1232>: stur s6, [x12, #-0x40]
0x10130e150 <+1236>: fcmp s6, #0.0
0x10130e154 <+1240>: b.eq 0x10130e1b4 ; <+1336>
0x10130e158 <+1244>: fmul s7, s7, s6
0x10130e15c <+1248>: fmul.4s v7, v4, v7[0]
0x10130e160 <+1252>: ldp q16, q17, [x11]
0x10130e164 <+1256>: fadd.4s v7, v7, v16
0x10130e168 <+1260>: mov.d x14, v7[1]
0x10130e16c <+1264>: ldrb w15, [x11, #0x7a]
0x10130e170 <+1268>: dup.4s v16, w15
0x10130e174 <+1272>: mov.d x15, v16[1]
0x10130e178 <+1276>: fmov x16, d7
0x10130e17c <+1280>: and x15, x15, #0x4
0x10130e180 <+1284>: and.16b v7, v16, v1
0x10130e184 <+1288>: mov.d v7[1], x15
0x10130e188 <+1292>: cmeq.4s v7, v7, v2
0x10130e18c <+1296>: mov.d x15, v7[1]
0x10130e190 <+1300>: fmov x17, d7
0x10130e194 <+1304>: and x16, x17, x16
0x10130e198 <+1308>: and x14, x15, x14
0x10130e19c <+1312>: stp x16, x14, [x11]
0x10130e1a0 <+1316>: ldur q7, [x12, #-0x58]
0x10130e1a4 <+1320>: fmul.4s v6, v7, v6[0]
0x10130e1a8 <+1324>: fadd.4s v6, v17, v6
0x10130e1ac <+1328>: str q6, [x11, #0x10]
0x10130e1b0 <+1332>: ldr s7, [x10]
0x10130e1b4 <+1336>: ldr s6, [x12]
0x10130e1b8 <+1340>: fmul s6, s6, s0
0x10130e1bc <+1344>: str s6, [x12]
0x10130e1c0 <+1348>: fcmp s6, #0.0
0x10130e1c4 <+1352>: b.eq 0x10130e220 ; <+1444>
0x10130e1c8 <+1356>: fmul s7, s7, s6
0x10130e1cc <+1360>: fmul.4s v7, v5, v7[0]
0x10130e1d0 <+1364>: ldp q16, q17, [x11]
0x10130e1d4 <+1368>: fadd.4s v7, v7, v16
0x10130e1d8 <+1372>: mov.d x14, v7[1]
0x10130e1dc <+1376>: fmov x15, d7
0x10130e1e0 <+1380>: ldrb w16, [x11, #0x7a]
0x10130e1e4 <+1384>: dup.4s v7, w16
0x10130e1e8 <+1388>: mov.d x16, v7[1]
0x10130e1ec <+1392>: and x16, x16, #0x4
0x10130e1f0 <+1396>: and.16b v7, v7, v1
0x10130e1f4 <+1400>: mov.d v7[1], x16
0x10130e1f8 <+1404>: cmeq.4s v7, v7, v2
0x10130e1fc <+1408>: mov.d x16, v7[1]
0x10130e200 <+1412>: fmov x17, d7
0x10130e204 <+1416>: and x15, x17, x15
0x10130e208 <+1420>: and x14, x16, x14
0x10130e20c <+1424>: stp x15, x14, [x11]
0x10130e210 <+1428>: ldur q7, [x12, #-0x18]
0x10130e214 <+1432>: fmul.4s v6, v7, v6[0]
0x10130e218 <+1436>: fadd.4s v6, v17, v6
0x10130e21c <+1440>: str q6, [x11, #0x10]
0x10130e220 <+1444>: ldr s7, [x10]
0x10130e224 <+1448>: ldur s6, [x12, #-0x80]
0x10130e228 <+1452>: fmul s6, s6, s0
0x10130e22c <+1456>: stur s6, [x12, #-0x80]
0x10130e230 <+1460>: fcmp s6, #0.0
0x10130e234 <+1464>: b.eq 0x10130e11c ; <+1184>
0x10130e238 <+1468>: fmul s7, s7, s6
0x10130e23c <+1472>: fmul.4s v7, v3, v7[0]
0x10130e240 <+1476>: ldp q16, q17, [x11]
0x10130e244 <+1480>: fadd.4s v7, v7, v16
0x10130e248 <+1484>: mov.d x14, v7[1]
0x10130e24c <+1488>: fmov x15, d7
0x10130e250 <+1492>: ldrb w16, [x11, #0x7a]
0x10130e254 <+1496>: dup.4s v7, w16
0x10130e258 <+1500>: mov.d x16, v7[1]
0x10130e25c <+1504>: and x16, x16, #0x4
0x10130e260 <+1508>: and.16b v7, v7, v1
0x10130e264 <+1512>: mov.d v7[1], x16
0x10130e268 <+1516>: cmeq.4s v7, v7, v2
0x10130e26c <+1520>: mov.d x16, v7[1]
0x10130e270 <+1524>: fmov x17, d7
0x10130e274 <+1528>: and x15, x17, x15
0x10130e278 <+1532>: and x14, x16, x14
0x10130e27c <+1536>: stp x15, x14, [x11]
0x10130e280 <+1540>: ldur q7, [x12, #-0x98]
0x10130e284 <+1544>: fmul.4s v6, v7, v6[0]
0x10130e288 <+1548>: fadd.4s v6, v17, v6
0x10130e28c <+1552>: str q6, [x11, #0x10]
0x10130e290 <+1556>: b 0x10130e11c ; <+1184>
0x10130e294 <+1560>: fcmp s7, s16
0x10130e298 <+1564>: fmul s7, s6, s6
0x10130e29c <+1568>: fadd s7, s7, s5
0x10130e2a0 <+1572>: fsqrt s7, s7
0x10130e2a4 <+1576>: fneg s6, s6
0x10130e2a8 <+1580>: movi d16, #0000000000000000
0x10130e2ac <+1584>: mov.s v16[1], v4[0]
0x10130e2b0 <+1588>: fmul.4s v17, v3, v3
0x10130e2b4 <+1592>: fadd s5, s17, s5
0x10130e2b8 <+1596>: fsqrt s5, s5
0x10130e2bc <+1600>: movi d17, #0000000000000000
0x10130e2c0 <+1604>: mov.s v17[0], v4[0]
0x10130e2c4 <+1608>: fneg s18, s3
0x10130e2c8 <+1612>: fcsel d4, d16, d17, le
0x10130e2cc <+1616>: fcsel s6, s6, s18, le
0x10130e2d0 <+1620>: fcsel s5, s7, s5, le
0x10130e2d4 <+1624>: umaddl x11, w13, w8, x12
0x10130e2d8 <+1628>: ldr w14, [x11, #0x38]
0x10130e2dc <+1632>: cbz w14, 0x10130dca8 ; <+44>
0x10130e2e0 <+1636>: dup.2s v6, v6[0]
0x10130e2e4 <+1640>: mov.d v4[1], v6[0]
0x10130e2e8 <+1644>: dup.4s v5, v5[0]
0x10130e2ec <+1648>: fdiv.4s v4, v4, v5
0x10130e2f0 <+1652>: ext.16b v5, v4, v4, #0xc
0x10130e2f4 <+1656>: ext.16b v5, v5, v4, #0x8
0x10130e2f8 <+1660>: fmul.4s v5, v3, v5
0x10130e2fc <+1664>: ext.16b v6, v3, v3, #0xc
0x10130e300 <+1668>: ext.16b v6, v6, v3, #0x8
0x10130e304 <+1672>: fmul.4s v6, v6, v4
0x10130e308 <+1676>: fsub.4s v6, v5, v6
0x10130e30c <+1680>: ext.16b v5, v6, v6, #0x4
0x10130e310 <+1684>: mov.s v5[2], v6[0]
0x10130e314 <+1688>: umaddl x12, w13, w8, x12
0x10130e318 <+1692>: add x11, x12, #0x28
0x10130e31c <+1696>: add x12, x12, #0xfc
0x10130e320 <+1700>: umull x13, w14, w9
0x10130e324 <+1704>: b 0x10130e334 ; <+1720>
0x10130e328 <+1708>: add x12, x12, #0xc8
0x10130e32c <+1712>: subs x13, x13, #0xc8
0x10130e330 <+1716>: b.eq 0x10130dca8 ; <+44>
0x10130e334 <+1720>: ldur s6, [x12, #-0x4c]
0x10130e338 <+1724>: fcmp s6, #0.0
0x10130e33c <+1728>: b.ne 0x10130e34c ; <+1744>
0x10130e340 <+1732>: ldur s6, [x12, #-0xc]
0x10130e344 <+1736>: fcmp s6, #0.0
0x10130e348 <+1740>: b.eq 0x10130e42c ; <+1968>
0x10130e34c <+1744>: ldr s7, [x11]
0x10130e350 <+1748>: ldur s6, [x12, #-0x40]
0x10130e354 <+1752>: fmul s6, s6, s0
0x10130e358 <+1756>: stur s6, [x12, #-0x40]
0x10130e35c <+1760>: fcmp s6, #0.0
0x10130e360 <+1764>: b.eq 0x10130e3c0 ; <+1860>
0x10130e364 <+1768>: fmul s7, s7, s6
0x10130e368 <+1772>: fmul.4s v7, v4, v7[0]
0x10130e36c <+1776>: ldp q16, q17, [x10]
0x10130e370 <+1780>: fsub.4s v7, v16, v7
0x10130e374 <+1784>: mov.d x14, v7[1]
0x10130e378 <+1788>: ldrb w15, [x10, #0x7a]
0x10130e37c <+1792>: dup.4s v16, w15
0x10130e380 <+1796>: mov.d x15, v16[1]
0x10130e384 <+1800>: fmov x16, d7
0x10130e388 <+1804>: and x15, x15, #0x4
0x10130e38c <+1808>: and.16b v7, v16, v1
0x10130e390 <+1812>: mov.d v7[1], x15
0x10130e394 <+1816>: cmeq.4s v7, v7, v2
0x10130e398 <+1820>: mov.d x15, v7[1]
0x10130e39c <+1824>: fmov x17, d7
0x10130e3a0 <+1828>: and x16, x17, x16
0x10130e3a4 <+1832>: and x14, x15, x14
0x10130e3a8 <+1836>: stp x16, x14, [x10]
0x10130e3ac <+1840>: ldur q7, [x12, #-0x64]
0x10130e3b0 <+1844>: fmul.4s v6, v7, v6[0]
0x10130e3b4 <+1848>: fsub.4s v6, v17, v6
0x10130e3b8 <+1852>: str q6, [x10, #0x10]
0x10130e3bc <+1856>: ldr s7, [x11]
0x10130e3c0 <+1860>: ldr s6, [x12]
0x10130e3c4 <+1864>: fmul s6, s6, s0
0x10130e3c8 <+1868>: str s6, [x12]
0x10130e3cc <+1872>: fcmp s6, #0.0
0x10130e3d0 <+1876>: b.eq 0x10130e42c ; <+1968>
0x10130e3d4 <+1880>: fmul s7, s7, s6
0x10130e3d8 <+1884>: fmul.4s v7, v5, v7[0]
0x10130e3dc <+1888>: ldp q16, q17, [x10]
0x10130e3e0 <+1892>: fsub.4s v7, v16, v7
0x10130e3e4 <+1896>: mov.d x14, v7[1]
0x10130e3e8 <+1900>: fmov x15, d7
0x10130e3ec <+1904>: ldrb w16, [x10, #0x7a]
0x10130e3f0 <+1908>: dup.4s v7, w16
0x10130e3f4 <+1912>: mov.d x16, v7[1]
0x10130e3f8 <+1916>: and x16, x16, #0x4
0x10130e3fc <+1920>: and.16b v7, v7, v1
0x10130e400 <+1924>: mov.d v7[1], x16
0x10130e404 <+1928>: cmeq.4s v7, v7, v2
0x10130e408 <+1932>: mov.d x16, v7[1]
0x10130e40c <+1936>: fmov x17, d7
0x10130e410 <+1940>: and x15, x17, x15
0x10130e414 <+1944>: and x14, x16, x14
0x10130e418 <+1948>: stp x15, x14, [x10]
0x10130e41c <+1952>: ldur q7, [x12, #-0x24]
0x10130e420 <+1956>: fmul.4s v6, v7, v6[0]
0x10130e424 <+1960>: fsub.4s v6, v17, v6
0x10130e428 <+1964>: str q6, [x10, #0x10]
0x10130e42c <+1968>: ldr s7, [x11]
0x10130e430 <+1972>: ldur s6, [x12, #-0x80]
0x10130e434 <+1976>: fmul s6, s6, s0
0x10130e438 <+1980>: stur s6, [x12, #-0x80]
0x10130e43c <+1984>: fcmp s6, #0.0
0x10130e440 <+1988>: b.eq 0x10130e328 ; <+1708>
0x10130e444 <+1992>: fmul s7, s7, s6
0x10130e448 <+1996>: fmul.4s v7, v3, v7[0]
0x10130e44c <+2000>: ldp q16, q17, [x10]
0x10130e450 <+2004>: fsub.4s v7, v16, v7
0x10130e454 <+2008>: mov.d x14, v7[1]
0x10130e458 <+2012>: fmov x15, d7
0x10130e45c <+2016>: ldrb w16, [x10, #0x7a]
0x10130e460 <+2020>: dup.4s v7, w16
0x10130e464 <+2024>: mov.d x16, v7[1]
0x10130e468 <+2028>: and x16, x16, #0x4
0x10130e46c <+2032>: and.16b v7, v7, v1
0x10130e470 <+2036>: mov.d v7[1], x16
0x10130e474 <+2040>: cmeq.4s v7, v7, v2
0x10130e478 <+2044>: mov.d x16, v7[1]
0x10130e47c <+2048>: fmov x17, d7
0x10130e480 <+2052>: and x15, x17, x15
0x10130e484 <+2056>: and x14, x16, x14
0x10130e488 <+2060>: stp x15, x14, [x10]
0x10130e48c <+2064>: ldur q7, [x12, #-0xa4]
0x10130e490 <+2068>: fmul.4s v6, v7, v6[0]
0x10130e494 <+2072>: fsub.4s v6, v17, v6
0x10130e498 <+2076>: str q6, [x10, #0x10]
0x10130e49c <+2080>: b 0x10130e328 ; <+1708>
0x10130e4a0 <+2084>: ldrb w12, [x11, #0x7b]
0x10130e4a4 <+2088>: cmp w12, #0x0
0x10130e4a8 <+2092>: cset w13, eq
0x10130e4ac <+2096>: ldp w14, w15, [x3, #0x8]
0x10130e4b0 <+2100>: cmp w14, w12
0x10130e4b4 <+2104>: csel w12, w14, w12, hi
0x10130e4b8 <+2108>: ldrb w14, [x3, #0x10]
0x10130e4bc <+2112>: orr w13, w14, w13
0x10130e4c0 <+2116>: strb w13, [x3, #0x10]
0x10130e4c4 <+2120>: ldrb w11, [x11, #0x7c]
0x10130e4c8 <+2124>: cmp w11, #0x0
0x10130e4cc <+2128>: cset w13, eq
0x10130e4d0 <+2132>: cmp w15, w11
0x10130e4d4 <+2136>: csel w11, w15, w11, hi
0x10130e4d8 <+2140>: stp w12, w11, [x3, #0x8]
0x10130e4dc <+2144>: ldrb w11, [x3, #0x11]
0x10130e4e0 <+2148>: orr w11, w11, w13
0x10130e4e4 <+2152>: strb w11, [x3, #0x11]
0x10130e4e8 <+2156>: b 0x10130dca8 ; <+44>
0x10130e4ec <+2160>: ret
(lldb) frame 1
invalid command 'frame 1'.
(lldb) frame select 1
frame #1: 0x0000000128ef40c0 libthrive_native_without_avx.dylib`JPH::PhysicsSystem::JobSolveVelocityConstraints(this=0x00000003cda25e00, ioContext=0x000000016fdfcb08, ioStep=0x000000013ad00000) at PhysicsSystem.cpp:1432:20
1429
1430 // We didn't create a split, just run the solver now for this entire island. Begin by warm starting.
1431 ConstraintManager::sWarmStartVelocityConstraints(active_constraints, constraints_begin, constraints_end, warm_start_impulse_ratio, steps_calculator);
-> 1432 mContactManager.WarmStartVelocityConstraints(contacts_begin, contacts_end, warm_start_impulse_ratio, steps_calculator);
1433 steps_calculator.Finalize();
1434
1435 // Store the number of position steps for later
So it sadly looks like I have a somewhat optimized build, as there's no source code associated with the topmost frame in the callstack. I'll investigate why that is, but what I have so far matches the earlier results: the PhysicsSystem.cpp:1432 line matches the crash callstacks on my system, and that should be the latest version from master. I'll report back a bit later, but for now I need to prepare a test build of my game (sadly I think I'll need to skip the Mac version, as it specifically has this issue).
You could try switching to JobSystemSingleThreaded to get rid of multithreading completely on Jolt's side.
Other than that I think it is indeed important to get a debug build running as it does a lot more error checking and will give us better line info.
I verified that I am using the latest library I compiled. I switched to the single threaded job system, and I even tried to force the -O0 compile flag, but none of that makes the source location available for where the crash happens. Here's the new callstack, where the last line number has changed by 1 because I added a blank line to verify that things are rebuilding:
* thread #1, name = 'TMain', queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
* frame #0: 0x0000000103f79d08 Godot`void JPH::ContactConstraintManager::WarmStartVelocityConstraints<JPH::CalculateSolverSteps>(unsigned int const*, unsigned int const*, float, JPH::CalculateSolverSteps&) + 140
frame #1: 0x000000013bff2188 libthrive_native_without_avx.dylib`JPH::PhysicsSystem::JobSolveVelocityConstraints(this=0x000000015fd02000, ioContext=0x000000016d190ef8, ioStep=0x0000000366800000) at PhysicsSystem.cpp:1433:20
frame #2: 0x000000013c013c54 libthrive_native_without_avx.dylib`JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12::operator()(this=0x0000000141fb0488) const at PhysicsSystem.cpp:439:31
frame #3: 0x000000013c013c1c libthrive_native_without_avx.dylib`decltype(std::declval<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12&>()()) std::__1::__invoke[abi:nn180100]<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12&>(__f=0x0000000141fb0488) at invoke.h:344:25
frame #4: 0x000000013c013bd4 libthrive_native_without_avx.dylib`void std::__1::__invoke_void_return_wrapper<void, true>::__call[abi:nn180100]<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12&>(__args=0x0000000141fb0488) at invoke.h:419:5
frame #5: 0x000000013c013bb0 libthrive_native_without_avx.dylib`std::__1::__function::__alloc_func<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12, std::__1::allocator<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12>, void ()>::operator()[abi:nn180100](this=0x0000000141fb0488) at function.h:169:12
frame #6: 0x000000013c012a88 libthrive_native_without_avx.dylib`std::__1::__function::__func<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12, std::__1::allocator<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12>, void ()>::operator()(this=0x0000000141fb0480) at function.h:311:10
frame #7: 0x000000013bbac394 libthrive_native_without_avx.dylib`std::__1::__function::__value_func<void ()>::operator()[abi:nn180100](this=0x0000000141fb0480) const at function.h:428:12
frame #8: 0x000000013bbabab4 libthrive_native_without_avx.dylib`std::__1::function<void ()>::operator()(this=0x0000000141fb0480) const at function.h:981:10
frame #9: 0x000000013bbab298 libthrive_native_without_avx.dylib`JPH::JobSystem::Job::Execute(this=0x0000000141fb0460) at JobSystem.h:245:5
frame #10: 0x000000013bbab110 libthrive_native_without_avx.dylib`JPH::JobSystemSingleThreaded::QueueJob(this=0x00000001775e5500, inJob=0x0000000141fb0460) at JobSystemSingleThreaded.cpp:41:9
frame #11: 0x000000013bbab45c libthrive_native_without_avx.dylib`JPH::JobSystemSingleThreaded::QueueJobs(this=0x00000001775e5500, inJobs=0x000000016d18c690, inNumJobs=1) at JobSystemSingleThreaded.cpp:47:3
frame #12: 0x000000013c004d58 libthrive_native_without_avx.dylib`JPH::JobSystem::JobHandle::sRemoveDependencies(inHandles=0x0000000366801758, inNumHandles=1, inCount=1) at JobSystem.inl:53:15
frame #13: 0x000000013bfea874 libthrive_native_without_avx.dylib`void JPH::JobSystem::JobHandle::sRemoveDependencies<32u>(inHandles=0x0000000366801750, inCount=1) at JobSystem.h:114:4
frame #14: 0x000000013bfe922c libthrive_native_without_avx.dylib`JPH::PhysicsSystem::Update(this=0x000000015fd02000, inDeltaTime=0.0166666675, inCollisionSteps=1, inTempAllocator=0x00006000024d47a0, inJobSystem=0x00000001775e5500) at PhysicsSystem.cpp:461:4
frame #15: 0x000000013c119adc libthrive_native_without_avx.dylib`Thrive::Physics::PhysicalWorld::StepPhysics(float) + 196
Running in lldb again, I set a breakpoint at PhysicsSystem.cpp:1433 and single stepped instructions until stepping stalls, as the debugger gets stuck just before the crash each time (I omitted a few steps for brevity):
Process 5807 stopped
* thread #1, name = 'TMain', queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
frame #0: 0x000000012e96e174 libthrive_native_without_avx.dylib`JPH::PhysicsSystem::JobSolveVelocityConstraints(this=0x000000039b887a00, ioContext=0x000000016fdfcb28, ioStep=0x0000000169800000) at PhysicsSystem.cpp:1433:4
1430 // We didn't create a split, just run the solver now for this entire island. Begin by warm starting.
1431 ConstraintManager::sWarmStartVelocityConstraints(active_constraints, constraints_begin, constraints_end, warm_start_impulse_ratio, steps_calculator);
1432
-> 1433 mContactManager.WarmStartVelocityConstraints(contacts_begin, contacts_end, warm_start_impulse_ratio, steps_calculator);
(lldb)
Process 5830 stopped
* thread #1, name = 'TMain', queue = 'com.apple.main-thread', stop reason = instruction step into
frame #0: 0x000000010130dd00 godot`void JPH::ContactConstraintManager::WarmStartVelocityConstraints<JPH::CalculateSolverSteps>(unsigned int const*, unsigned int const*, float, JPH::CalculateSolverSteps&) + 132
godot`JPH::ContactConstraintManager::WarmStartVelocityConstraints<JPH::CalculateSolverSteps>:
-> 0x10130dd00 <+132>: ldr w13, [x1]
0x10130dd04 <+136>: umaddl x11, w13, w8, x12
0x10130dd08 <+140>: ldp x10, x14, [x11]
0x10130dd0c <+144>: ldrb w15, [x10, #0x78]
Target 0: (godot) stopped.
(lldb)
Process 5830 stopped
* thread #1, name = 'TMain', queue = 'com.apple.main-thread', stop reason = instruction step into
frame #0: 0x000000010130dd04 godot`void JPH::ContactConstraintManager::WarmStartVelocityConstraints<JPH::CalculateSolverSteps>(unsigned int const*, unsigned int const*, float, JPH::CalculateSolverSteps&) + 136
godot`JPH::ContactConstraintManager::WarmStartVelocityConstraints<JPH::CalculateSolverSteps>:
-> 0x10130dd04 <+136>: umaddl x11, w13, w8, x12
0x10130dd08 <+140>: ldp x10, x14, [x11]
0x10130dd0c <+144>: ldrb w15, [x10, #0x78]
0x10130dd10 <+148>: ldr x11, [x14, #0x48]
Target 0: (godot) stopped.
(lldb)
Process 5830 stopped
* thread #1, name = 'TMain', queue = 'com.apple.main-thread', stop reason = instruction step into
frame #0: 0x000000010130dd08 godot`void JPH::ContactConstraintManager::WarmStartVelocityConstraints<JPH::CalculateSolverSteps>(unsigned int const*, unsigned int const*, float, JPH::CalculateSolverSteps&) + 140
godot`JPH::ContactConstraintManager::WarmStartVelocityConstraints<JPH::CalculateSolverSteps>:
-> 0x10130dd08 <+140>: ldp x10, x14, [x11]
0x10130dd0c <+144>: ldrb w15, [x10, #0x78]
0x10130dd10 <+148>: ldr x11, [x14, #0x48]
0x10130dd14 <+152>: cmp w15, #0x2
(lldb) si
Process 5830 stopped
* thread #1, name = 'TMain', queue = 'com.apple.main-thread', stop reason = instruction step into
frame #0: 0x000000010130dd0c godot`void JPH::ContactConstraintManager::WarmStartVelocityConstraints<JPH::CalculateSolverSteps>(unsigned int const*, unsigned int const*, float, JPH::CalculateSolverSteps&) + 144
godot`JPH::ContactConstraintManager::WarmStartVelocityConstraints<JPH::CalculateSolverSteps>:
-> 0x10130dd0c <+144>: ldrb w15, [x10, #0x78]
0x10130dd10 <+148>: ldr x11, [x14, #0x48]
0x10130dd14 <+152>: cmp w15, #0x2
0x10130dd18 <+156>: b.ne 0x10130e06c ; <+1008>
I've tried multiple times and the stepping always fails after void JPH::ContactConstraintManager::WarmStartVelocityConstraints<JPH::CalculateSolverSteps>(unsigned int const*, unsigned int const*, float, JPH::CalculateSolverSteps&) + 144 so that instruction at offset 144 is the one that always causes a problem of some kind.
If I understood the ARM disassembly correctly, it tries to read something through a pointer plus an offset. The data in the register seems kind of okay, but I'll try to run again and peek at the memory before the read:
-> 0x10130dd0c <+144>: ldrb w15, [x10, #0x78]
0x10130dd10 <+148>: ldr x11, [x14, #0x48]
0x10130dd14 <+152>: cmp w15, #0x2
0x10130dd18 <+156>: b.ne 0x10130e06c ; <+1008>
Target 0: (godot) stopped.
(lldb) register read --all
General Purpose Registers:
x0 = 0x000000016b25f488
x1 = 0x0000000139115dc0
x2 = 0x0000000139115dc4
x3 = 0x000000016fdf7fb8
x4 = 0x000000016b25f488
x5 = 0x0000000000000000
x6 = 0x000000016fdf7fb8
x7 = 0x0000000000000001
x8 = 0x0000000000000360
x9 = 0x00000000000000c8
x10 = 0x00a5002d001e804d
If that doesn't lead to more clues then I'm about at the end of my debugging abilities / ideas on what to try...
Edit: actually, the debugger sometimes already gets stuck at instruction offset 140 instead of 144, so that is likely where the problems start.
I need to stop looking for now, but here's what I think I've found.
Either the x11 or the x10 register gets loaded with a pointer to an invalid memory location, and that crashes at instruction offset 140 or 144. I don't really understand the math that is done on the other registers to get the results. But here are some of the intermediate values, followed by my check that the final calculated x10 is not a valid memory address and will cause a crash:
The structure of the assembly kind of looks like it matches the loop in the source code, but I'm not familiar enough with debugging from disassembly alone to really tell what's going on:
template <class MotionPropertiesCallback>
void ContactConstraintManager::WarmStartVelocityConstraints(const uint32 *inConstraintIdxBegin, const uint32 *inConstraintIdxEnd, float inWarmStartImpulseRatio, MotionPropertiesCallback &ioCallback)
{
JPH_PROFILE_FUNCTION();
for (const uint32 *constraint_idx = inConstraintIdxBegin; constraint_idx < inConstraintIdxEnd; ++constraint_idx)
{
ContactConstraint &constraint = mConstraints[*constraint_idx];
// Fetch bodies
Body &body1 = *constraint.mBody1;
EMotionType motion_type1 = body1.GetMotionType();
MotionProperties *motion_properties1 = body1.GetMotionPropertiesUnchecked();
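For anyone trying to map the stepped instructions onto this loop, here is a rough sketch of the address arithmetic. It assumes that `umaddl x11, w13, w8, x12` is the indexing `mConstraints[*constraint_idx]` (base in x12, index in w13, element size in w8, which was 0x360 in the register dump), and that the following `ldp` / `ldrb` read the body pointers and a byte from the first body. The helper names and the 47-bit user address limit are assumptions made for illustration; they are not part of Jolt or lldb.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of Jolt: reproduces what `umaddl xd, wn, wm, xa`
// computes, i.e. base + index * stride with both 32-bit factors zero-extended.
constexpr std::uint64_t ElementAddress(std::uint64_t inBase, std::uint32_t inIndex, std::uint32_t inStride)
{
    return inBase + std::uint64_t(inIndex) * std::uint64_t(inStride);
}

// Assumption: user-space pointers on this platform fit in the low 47 bits,
// so a value with any higher bit set cannot be a valid pointer.
constexpr bool LooksLikeUserPointer(std::uint64_t inValue)
{
    return inValue != 0 && (inValue >> 47) == 0;
}

// Plugs in values taken from the register dump above.
inline void DemoWithValuesFromTheDump()
{
    // x1 (the index range begin) looks like an ordinary heap address
    assert(LooksLikeUserPointer(0x0000000139115dc0ULL));

    // x10 (presumably the mBody1 pointer that was just loaded) does not
    assert(!LooksLikeUserPointer(0x00a5002d001e804dULL));

    // With a stride of 0x360, element 2 sits 0x6c0 bytes past the array base
    assert(ElementAddress(0x1000, 2, 0x360) == 0x16c0);
}
```

If this reading is right, the loaded x10 value is garbage before the `ldrb` at offset 144 ever dereferences it, which would point at either the array base, the index, or the element layout being wrong at that moment.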
Unfortunately this doesn't give me much extra information. I think you really need to build a proper debug build (define JPH_DEBUG).
As far as I can tell, I already had that on: if I put add_compile_definitions("-DJPH_DEBUG=1") into my main CMakeLists.txt file, I get a compile error:
/Users/hhyyrylainen/Projects/Thrive/third_party/JoltPhysics/Build/../Jolt/Core/Core.h:503:10: error: 'JPH_DEBUG' macro redefined [-Werror,-Wmacro-redefined]
503 | #define JPH_DEBUG
| ^
<command line>:4:9: note: previous definition is here
4 | #define JPH_DEBUG 1
| ^
1 error generated.
I'm also trying everything else that might make Apple Clang not optimize that function.
Ok, if that's the case and there are no asserts firing then that's going to make it really difficult to debug.
Did you try compiling Jolt's unit tests/samples on your M1 to see if they work?
The unit tests seem to succeed, but I can't run the samples as apparently that requires macOS 15, and I'm still on 14 (in the hope that my compiled software would be more compatible with older macOS versions). I suppose I could update to the latest macOS version and install the latest Xcode version to see if that solves anything.
I found out that I had unintentionally enabled LTO on the library into which Jolt.a is merged. But sadly, even though disabling that made a few more of my breakpoints work, I still cannot debug at the line level in the WarmStartVelocityConstraints function.
Edit: updating my Mac allowed me to run the samples and that didn't turn up any issues (I let it run for about 30 minutes but it seemed like it was still running different samples by that point).
I think I found something!
After failing to defeat the compiler, I literally copy-pasted the code for WarmStartVelocityConstraints to where it was called, and now I was able to see which line triggered the crash. I found a case where constraints_begin is null. Putting in a JPH_ASSERT(constraints_begin != nullptr); seems to confirm that it always hits before the crash inside WarmStartVelocityConstraints. Looking a bit further up in the code, it looks like the situation where there are no constraints but there is a contact is not handled correctly:
Could that be it? It seems like it would make sense to skip the code that tries to read a null pointer but I don't know enough about Jolt to be sure if that's safe to do.
Edit: well I tried adding if (active_constraints == nullptr) continue; but that's clearly wrong as now some physics collisions just get ignored. At least it stops all crashing...
It's quite normal for constraints_begin to be null. If this is the case then constraints_end should also be null and has_constraints should be false; it just indicates that this particular island doesn't have any constraints (JPH::Constraint) that connect bodies in this island. The island in that case could still have contacts (or could have only a single body that doesn't collide with/connect to anything). If you're testing active_constraints == nullptr you're effectively checking if any JPH::Constraints are active and this is unrelated to contact points.
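To illustrate why a null begin/end pair is harmless, here is a minimal sketch using the same loop shape as the warm-start functions. `CountVisited` and the demo function are hypothetical stand-ins and not Jolt code; the only point is that the half-open range [begin, end) is empty when both pointers are null, so nothing gets dereferenced.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the loop shape used by the warm-start functions:
// counts how many indices lie in the half-open range [inBegin, inEnd).
inline int CountVisited(const std::uint32_t *inBegin, const std::uint32_t *inEnd)
{
    int visited = 0;
    for (const std::uint32_t *idx = inBegin; idx < inEnd; ++idx)
        ++visited;
    return visited;
}

inline void DemoEmptyAndNonEmptyRanges()
{
    // Island without any JPH::Constraint: both pointers are null, the loop body never runs
    assert(CountVisited(nullptr, nullptr) == 0);

    // The same island can still have contacts, described by a separate index range
    const std::uint32_t contacts[] = { 4, 7, 9 };
    assert(CountVisited(contacts, contacts + 3) == 3);
}
```

So skipping the island when the constraint range is empty also skips its contacts, which would match the ignored collisions seen with the `continue` experiment.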
The information so far was that we're crashing on contacts rather than on constraints, so I would expect that something is wrong with contacts_begin/contacts_end instead.
That's a very good point about contacts versus constraints. I noticed something quite interesting.
This is the function call:
mContactManager.WarmStartVelocityConstraints(contacts_begin, contacts_end, warm_start_impulse_ratio, steps_calculator);
Whereas the definition is:
void ContactConstraintManager::WarmStartVelocityConstraints(const uint32 *inConstraintIdxBegin, const uint32 *inConstraintIdxEnd, float inWarmStartImpulseRatio, MotionPropertiesCallback &ioCallback)
So there's the difference of contacts versus constraints used in the variable names.
I tried changing the variables to match in the call site, but that seems to disable all collisions.
I fixed a problem in my copy-pasted approach where I had used the wrong variable. Now everything works with the exact same code of the method copy-pasted into the call site. I'm starting to think this is some kind of weird compiler bug where the method call itself somehow causes the issue, rather than the code that is inside the method.
I was about to send this when you typed the last comment:
If you mean contacts_begin vs inConstraintIdxBegin then that's not strange. Contacts in the end are also constraints. ContactConstraintManager has its own array mConstraints of contact constraints that get created every frame and that are separate from ConstraintManager::mConstraints which contains instances of the Constraint class.
One more thing that you could try is to create a snapshot of the world every frame right before you call PhysicsSystem::Update:
https://github.com/jrouwe/JoltPhysics/blob/fd37495ad743949d0f68956064a8eda0de1d99d0/Samples/SamplesApp.cpp#L840-L851
and send me the resulting snapshot.bin from just before the crash. This does not save 100% of the state in the system, but does have a chance of giving me a repro case. Please do this on the master version of Jolt and tell me if you're using the double precision build or not.
But if the issue goes away after rearranging the code, then a compiler issue may indeed be what is causing this.
And another idea: You could not register your ContactListener / StepListener and BodyActivationListener to rule out the cases where you're changing state of the PhysicsSystem in an inappropriate way.
Thanks for explaining the contacts versus constraints thing. I was starting to suspect that this was the case and that it was safe to use the indices of one for the other.
I tried commenting out all the listener etc. registrations in my code:
// contactListener->SetNextListener(something);
// physicsSystem->SetContactListener(contactListener.get());
// Activation listening
activationListener = std::make_unique<BodyActivationListener>();
// physicsSystem->SetBodyActivationListener(activationListener.get());
stepListener = std::make_unique<StepListener>(*this);
// physicsSystem->AddStepListener(stepListener.get());
As that basically disables game movement, I suspected that would stop the crash, but that was not the case. It does make the crash take quite a bit longer to trigger on average, whereas before it triggered almost instantly.
I tried to look for the minimal set of changes needed to make the crash stop happening. I've reverted all the experimental changes (threading etc.) and made just that one problematic method static. This seems to make everything fully work, even in release mode. Here's the minimal diff that I needed to apply to Jolt to make this work:
diff --git a/Jolt/Physics/Constraints/ContactConstraintManager.cpp b/Jolt/Physics/Constraints/ContactConstraintManager.cpp
index 687ab895..37104060 100644
--- a/Jolt/Physics/Constraints/ContactConstraintManager.cpp
+++ b/Jolt/Physics/Constraints/ContactConstraintManager.cpp
@@ -1527,13 +1527,13 @@ JPH_INLINE void ContactConstraintManager::sWarmStartConstraint(ContactConstraint
}
template <class MotionPropertiesCallback>
-void ContactConstraintManager::WarmStartVelocityConstraints(const uint32 *inConstraintIdxBegin, const uint32 *inConstraintIdxEnd, float inWarmStartImpulseRatio, MotionPropertiesCallback &ioCallback)
+void ContactConstraintManager::sWarmStartVelocityConstraints(ContactConstraint* const inConstraints, const uint32 *inConstraintIdxBegin, const uint32 *inConstraintIdxEnd, float inWarmStartImpulseRatio, MotionPropertiesCallback &ioCallback)
{
JPH_PROFILE_FUNCTION();
for (const uint32 *constraint_idx = inConstraintIdxBegin; constraint_idx < inConstraintIdxEnd; ++constraint_idx)
{
- ContactConstraint &constraint = mConstraints[*constraint_idx];
+ ContactConstraint &constraint = inConstraints[*constraint_idx];
// Fetch bodies
Body &body1 = *constraint.mBody1;
@@ -1571,8 +1571,8 @@ void ContactConstraintManager::WarmStartVelocityConstraints(const uint32 *inCons
}
// Specialize for the two body callback types
-template void ContactConstraintManager::WarmStartVelocityConstraints<CalculateSolverSteps>(const uint32 *inConstraintIdxBegin, const uint32 *inConstraintIdxEnd, float inWarmStartImpulseRatio, CalculateSolverSteps &ioCallback);
-template void ContactConstraintManager::WarmStartVelocityConstraints<DummyCalculateSolverSteps>(const uint32 *inConstraintIdxBegin, const uint32 *inConstraintIdxEnd, float inWarmStartImpulseRatio, DummyCalculateSolverSteps &ioCallback);
+template void ContactConstraintManager::sWarmStartVelocityConstraints<CalculateSolverSteps>(ContactConstraint* const inConstraints, const uint32 *inConstraintIdxBegin, const uint32 *inConstraintIdxEnd, float inWarmStartImpulseRatio, CalculateSolverSteps &ioCallback);
+template void ContactConstraintManager::sWarmStartVelocityConstraints<DummyCalculateSolverSteps>(ContactConstraint* const inConstraints, const uint32 *inConstraintIdxBegin, const uint32 *inConstraintIdxEnd, float inWarmStartImpulseRatio, DummyCalculateSolverSteps &ioCallback);
template <EMotionType Type1, EMotionType Type2>
JPH_INLINE bool ContactConstraintManager::sSolveVelocityConstraint(ContactConstraint &ioConstraint, MotionProperties *ioMotionProperties1, MotionProperties *ioMotionProperties2)
diff --git a/Jolt/Physics/Constraints/ContactConstraintManager.h b/Jolt/Physics/Constraints/ContactConstraintManager.h
index 635ce435..81b798a7 100644
--- a/Jolt/Physics/Constraints/ContactConstraintManager.h
+++ b/Jolt/Physics/Constraints/ContactConstraintManager.h
@@ -160,6 +160,8 @@ public:
/// Sort contact constraints deterministically
void SortContacts(uint32 *inConstraintIdxBegin, uint32 *inConstraintIdxEnd) const;
+ class ContactConstraint;
+
/// Get the affected bodies for a given constraint
inline void GetAffectedBodies(uint32 inConstraintIdx, const Body *&outBody1, const Body *&outBody2) const
{
@@ -170,7 +172,7 @@ public:
/// Apply last frame's impulses as an initial guess for this frame's impulses
template <class MotionPropertiesCallback>
- void WarmStartVelocityConstraints(const uint32 *inConstraintIdxBegin, const uint32 *inConstraintIdxEnd, float inWarmStartImpulseRatio, MotionPropertiesCallback &ioCallback);
+ static void sWarmStartVelocityConstraints(ContactConstraint* inConstraints, const uint32 *inConstraintIdxBegin, const uint32 *inConstraintIdxEnd, float inWarmStartImpulseRatio, MotionPropertiesCallback &ioCallback);
/// Solve velocity constraints, when almost nothing changes this should only apply very small impulses
/// since we're warm starting with the total impulse applied in the last frame above.
@@ -437,6 +439,7 @@ private:
using WorldContactPoints = StaticArray<WorldContactPoint, MaxContactPoints>;
+public:
/// Contact constraint class, used for solving penetrations
class ContactConstraint
{
@@ -479,6 +482,10 @@ public:
/// The maximum value that can be passed to Init for inMaxBodyPairs. Note you should really use a lower value, using this value will cost a lot of memory!
static constexpr uint cMaxBodyPairsLimit = ~uint(0) / sizeof(BodyPairMap::KeyValue);
+ ContactConstraint* GetConstraints() const
+ {
+ return mConstraints;
+ }
private:
/// Internal helper function to calculate the friction and non-penetration constraint properties. Templated to the motion type to reduce the amount of branches and calculations.
template <EMotionType Type1, EMotionType Type2>
diff --git a/Jolt/Physics/PhysicsSystem.cpp b/Jolt/Physics/PhysicsSystem.cpp
index 507ed4c1..93d72016 100644
--- a/Jolt/Physics/PhysicsSystem.cpp
+++ b/Jolt/Physics/PhysicsSystem.cpp
@@ -1345,7 +1345,7 @@ void PhysicsSystem::JobSolveVelocityConstraints(PhysicsUpdateContext *ioContext,
// Iteration 0 is used to warm start the batch (we added 1 to the number of iterations in LargeIslandSplitter::SplitIsland)
DummyCalculateSolverSteps dummy;
ConstraintManager::sWarmStartVelocityConstraints(active_constraints, constraints_begin, constraints_end, warm_start_impulse_ratio, dummy);
- mContactManager.WarmStartVelocityConstraints(contacts_begin, contacts_end, warm_start_impulse_ratio, dummy);
+ ContactConstraintManager::sWarmStartVelocityConstraints(mContactManager.GetConstraints(), contacts_begin, contacts_end, warm_start_impulse_ratio, dummy);
}
else
{
@@ -1429,7 +1429,7 @@ void PhysicsSystem::JobSolveVelocityConstraints(PhysicsUpdateContext *ioContext,
// We didn't create a split, just run the solver now for this entire island. Begin by warm starting.
ConstraintManager::sWarmStartVelocityConstraints(active_constraints, constraints_begin, constraints_end, warm_start_impulse_ratio, steps_calculator);
- mContactManager.WarmStartVelocityConstraints(contacts_begin, contacts_end, warm_start_impulse_ratio, steps_calculator);
+ ContactConstraintManager::sWarmStartVelocityConstraints(mContactManager.GetConstraints(), contacts_begin, contacts_end, warm_start_impulse_ratio, steps_calculator);
steps_calculator.Finalize();
// Store the number of position steps for later
It's not the cleanest change, but for some reason it makes everything work. I let my game run for dozens of minutes with this version and there was no crash, whereas without that small patch the crash happens within a few seconds at most. This fix also works in release mode.
I'm pretty sure this is some kind of weird compiler bug that can be worked around by tricking the compiler into not doing whatever it did with the method when it was not static.
Ok, that's pretty weird indeed.
What compiler version are you using? (and is it possible to update to a newer version?)
I checked multiple times that my Xcode and command line tools are up to date. As far as I can tell, 16 (reported as the compiler version by cmake when configuring) is the latest version. I believe that is what I have; I turned off my Mac for today, but I can double check tomorrow whether there is a way to get a more recent version of the AppleClang compiler.
Okay, so surprisingly my Mac today showed me an Xcode update that it did not want to give me 2 days ago, no matter how many times I checked for updates. So now I have the following versions:
-- The C compiler identification is AppleClang 17.0.0.17000013
-- The CXX compiler identification is AppleClang 17.0.0.17000013
Ninja 1.12.1
cmake 4.0.1
But sadly there is still the same issue as before so updating the compiler from version 16 to 17 does not seem to resolve this problem.
As one last thing I tried adding set(INTERPROCEDURAL_OPTIMIZATION OFF) before including the Jolt cmake files, but that didn't have an effect.
So ultimately I have found nothing else I can do to make things work except make that one function in Jolt static (and slightly adjust the surrounding code to fit). Even if it is required just to work around a very specific, probably a compiler, problem would a change like that be acceptable into Jolt?
-- The CXX compiler identification is AppleClang 17.0.0.17000013
I'm running the same version.
Even if it is required just to work around a very specific, probably a compiler, problem would a change like that be acceptable into Jolt?
I understand that it's a fairly small change, but I'd prefer to find the root cause.
I've been trying to repro your crash by compiling Jolt with what I think are the same parameters as yours:
./cmake_linux_clang_gcc.sh Release clang++ -DDOUBLE_PRECISION=ON -DUSE_AVX=ON -DUSE_AVX2=ON -DUSE_F16C=ON -DUSE_LZCNT=ON -DUSE_TZCNT=ON -DUSE_FMADD=OFF -DUSE_SSE4_1=ON -DUSE_SSE4_2=ON -DUSE_AVX512=OFF -DCMAKE_POSITION_INDEPENDENT_CODE=ON -DCPP_RTTI_ENABLED=ON
(picked Release as I'm not sure which config you're testing with and copied settings from here)
So far this doesn't trigger the crash.
Would it be possible to give me the exact command line you're using to compile and link the application (i.e. give me a verbose output of compiling both Jolt and linking the app so that I can see all commands)? That way I can compare if I'm compiling it in the exact same way.
And could you make a dump from right before the crash with master code using the SampleApp::TakeSnapshot code above?
Would it be possible to give me the exact command line you're using to compile and link the application (i.e. give me a verbose output of compiling both Jolt and linking the app so that I can see all commands)? That way I can compare if I'm compiling it in the exact same way.
Sure. Here's one file and the final link command:
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++ -DJPH_DEBUG_RENDERER -DJPH_DOUBLE_PRECISION -DJPH_OBJECT_STREAM -DJPH_PROFILE_ENABLED -D_DEBUG -I/Users/hhyyrylainen/Projects/Thrive/build-debug -I/Users/hhyyrylainen/Projects/Thrive/build-debug/api -I/Users/hhyyrylainen/Projects/Thrive/third_party/JoltPhysics/Build/.. -Wall -Werror -g -frtti -fno-exceptions -ffp-model=precise -faligned-allocation -std=c++17 -arch arm64 -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX15.4.sdk -fPIC -fcolor-diagnostics -pthread -Winvalid-pch -Xclang -include-pch -Xclang /Users/hhyyrylainen/Projects/Thrive/build-debug/third_party/JoltPhysics/Build/CMakeFiles/Jolt.dir/cmake_pch.hxx.pch -Xclang -include -Xclang /Users/hhyyrylainen/Projects/Thrive/build-debug/third_party/JoltPhysics/Build/CMakeFiles/Jolt.dir/cmake_pch.hxx -MD -MT third_party/JoltPhysics/Build/CMakeFiles/Jolt.dir/__/Jolt/Core/Factory.cpp.o -MF third_party/JoltPhysics/Build/CMakeFiles/Jolt.dir/__/Jolt/Core/Factory.cpp.o.d -o third_party/JoltPhysics/Build/CMakeFiles/Jolt.dir/__/Jolt/Core/Factory.cpp.o -c /Users/hhyyrylainen/Projects/Thrive/third_party/JoltPhysics/Jolt/Core/Factory.cpp
[1143/1167] : && /opt/homebrew/bin/cmake -E rm -f third_party/JoltPhysics/Build/libJolt_without_avx.a && /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ar qc third_party/JoltPhysics/Build/libJolt_without_avx.a third_party/JoltPhysics/Build/CMakeFiles/Jolt.dir/__/Jolt/AABBTree/AABBTreeBuilder.cpp.o EXCLUDED A BUNCH OF .o FILES HERE && /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib third_party/JoltPhysics/Build/libJolt_without_avx.a && /opt/homebrew/bin/cmake -E touch third_party/JoltPhysics/Build/libJolt_without_avx.a &&
[1159/1167] : && /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++ -g -arch arm64 -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX15.4.sdk -dynamiclib -Wl,-headerpad_max_install_names -pthread -o src/native/libthrive_native_without_avx.dylib -install_name @rpath/libthrive_native_without_avx.dylib src/native/CMakeFiles/thrive_native.dir/interop/CInterop.cpp.o src/native/CMakeFiles/thrive_native.dir/core/Logger.cpp.o src/native/CMakeFiles/thrive_native.dir/core/TaskSystem.cpp.o src/native/CMakeFiles/thrive_native.dir/physics/BodyActivationListener.cpp.o src/native/CMakeFiles/thrive_native.dir/physics/ContactListener.cpp.o src/native/CMakeFiles/thrive_native.dir/physics/PhysicalWorld.cpp.o src/native/CMakeFiles/thrive_native.dir/physics/PhysicsBody.cpp.o src/native/CMakeFiles/thrive_native.dir/physics/ShapeCreator.cpp.o src/native/CMakeFiles/thrive_native.dir/physics/ShapeWrapper.cpp.o src/native/CMakeFiles/thrive_native.dir/physics/SimpleShapes.cpp.o src/native/CMakeFiles/thrive_native.dir/physics/TrackedConstraint.cpp.o src/native/CMakeFiles/thrive_native.dir/physics/StepListener.cpp.o src/native/CMakeFiles/thrive_native.dir/physics/DebugDrawForwarder.cpp.o src/native/CMakeFiles/thrive_native.dir/shared/IntercommunicationManager.cpp.o -Wl,-rpath,"\$ORIGIN" third_party/JoltPhysics/Build/libJolt_without_avx.a
I'll attach a zip with the full build log. build_log.txt.zip
I'll try to create such a dump by putting that dump call just before the crashing method call. I assume I'll need to modify the Jolt cmake stuff to make the Jolt library link to the SampleApp project to make the code compile if I use such a snapshot method from there?
Here's the snapshot:
I'm not sure how I was supposed to do it, but I copy-pasted that TakeSnapshot method and made it work by disabling the asserts and safety checks in the lock system, which allowed the dump to run inside the velocity constraints solve call. The game crashed just after that dump was made, so it should capture the exact physics state that was being processed when the crash occurred.
I noticed a few new commits so I pulled in the latest master (2dcab94cbc97e57747aa96b4f7060ffedaa2dbc3) before making the dump.
Yes, copy pasting was the thing to do. And I would have put it before calling PhysicsSystem::Update to avoid locking issues. But I'll take a look if I can use the current dump. Thanks!
It takes about a second or so for the crash to happen so the dump from earlier wouldn't necessarily have the problem yet. Though now that I think about it, I could have probably made it so that earlier dumps are overwritten until the process crashes.
I compiled with the same compiler and settings as you (./cmake_linux_clang_gcc.sh Debug clang++ -DDOUBLE_PRECISION=ON -DCMAKE_POSITION_INDEPENDENT_CODE=ON -DCPP_RTTI_ENABLED=ON) and loaded the snapshot. Unfortunately all objects are static in that snapshot (presumably because it's the first frame), so nothing happens and it doesn't crash. Could you create a dump from the last frame before the crash (indeed overwriting dumps every frame)?
What I'm surprised about is that you have built a Debug build without any optimization settings. I've never seen a code gen issue in debug builds before (and have had many due to the optimizer).
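One way to keep only the most recent dump is to overwrite a single file every frame, writing to a temporary file first and renaming it over the previous dump so that a crash in the middle of a write does not destroy the last complete snapshot. The sketch below is only an illustration of that idea: `WriteRollingSnapshot` and `ReadAll` are hypothetical helpers, and the `inSave` callback stands in for whatever copy of the TakeSnapshot logic is used to serialize the physics state.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <functional>
#include <iterator>
#include <string>
#include <system_error>

// Hypothetical helper: writes a snapshot to "<target>.tmp" and then renames it
// over the target, so the target always holds the last fully written dump.
inline bool WriteRollingSnapshot(const std::filesystem::path &inTarget,
                                 const std::function<void(std::ostream &)> &inSave)
{
    std::filesystem::path tmp = inTarget;
    tmp += ".tmp";
    {
        std::ofstream out(tmp, std::ios::binary | std::ios::trunc);
        if (!out)
            return false;
        inSave(out); // Assumption: the caller serializes the physics state here
        out.flush();
        if (!out)
            return false;
    } // Stream is closed here, before the rename

    // Replaces the previous dump; on POSIX systems this replacement is atomic.
    std::error_code ec;
    std::filesystem::rename(tmp, inTarget, ec);
    return !ec;
}

// Hypothetical helper used only to inspect the result.
inline std::string ReadAll(const std::filesystem::path &inPath)
{
    std::ifstream in(inPath, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in), std::istreambuf_iterator<char>());
}
```

Calling this once per frame, right before the physics update, leaves the state from the last frame that started on disk when the process crashes.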
The snapshot was from the physics step in which the crash happened. But I'll move the dump outside the physics update, in case forcing my way through the locks that initially prevented reading the data was not the best idea.
Here's a new dump from just before the physics update (physicsSystem->Update(time, collisionStepsPerUpdate, tempAllocator.get(), &jobExecutor)) that causes the crash.
I've tried to use this latest snapshot, and I see some things moving now, but still no crash. I compared the disassembly of JPH::ContactConstraintManager::WarmStartVelocityConstraints<JPH::CalculateSolverSteps> around the crashing address with yours and except for some register renames, they appear identical. So I have no idea what this is.
I suggest that you keep your workaround and that we leave this open for a while to see if someone else also runs into the same issue.
For now I created a fork with the workaround applied: https://github.com/Revolutionary-Games/JoltPhysics/tree/mac_workaround
I ran into the same issue on a Mac Mini M4 Pro while testing JoltPhysicsSharp. It seems to have started with the native libraries update in this commit: https://github.com/amerkoleci/JoltPhysicsSharp/commit/c5ca685a18976793d71452432e7b336832d27eb6. Before this update everything worked well.
Dumping the backtrace. Please include this when reporting the bug to the project developer.
[1] invoke_previous_action(sigaction*, int, __siginfo*, void*, bool)
[2] 2 libsystem_platform.dylib 0x0000000195844624 _sigtramp + 56
[3] JPH::PhysicsSystem::JobSolveVelocityConstraints(JPH::PhysicsUpdateContext*, JPH::PhysicsUpdateContext::Step*)
[4] std::__1::__function::__func<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12, std::__1::allocator<JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)::$_12>, void ()>::operator()()
[5] JPH::JobSystemWithBarrier::BarrierImpl::Wait()
[6] JPH::PhysicsSystem::Update(float, int, JPH::TempAllocator*, JPH::JobSystem*)
[7] 7 ??? 0x000000030e5d67b8 0x0 + 13125904312
[8] 8 ??? 0x000000030e31726c 0x0 + 13123023468
[9] 9 ??? 0x000000030e314468 0x0 + 13123011688
[10] 10 ??? 0x000000030e314154 0x0 + 13123010900
[11] 11 ??? 0x000000030e313f8c 0x0 + 13123010444
[12] 12 ??? 0x000000030e313e38 0x0 + 13123010104