1.17 beta crash mystery thread

Open hrydgard opened this issue 1 year ago • 15 comments

Now with the beta program, we can do these before the release instead of after! There are already enough beta testers (I think because I once enabled registration in the past without actually having any builds) that we get a usable number of crash reports.

I've fixed a bunch of low-hanging fruit; here come the tricky ones.

First, an oldie-but-goodie assert. I really don't understand this one: it should not be possible for b.originalAddress to be null in FinalizeBlock. On the other hand, the block number should be equal to b.num, no? Weird.

(JitBlockCache.cpp:FinalizeBlock:250): [Memory::IsValidAddress(b.originalAddress)] (ULUS10543 WWE SmackDown vs. RAW 2011, 1884.2s) FinalizeBlock: Bad originalAddress 00000000 in block 107438 (b.num: 41902) proxy: n sz: 12

  #00  pc 0x0000000000038880  /apex/com.android.runtime/lib/bionic/libc.so (abort+172)
  #01  pc 0x00000000003fef8d  /apex/com.android.art/lib/libart.so (art::Runtime::Abort(char const*)+1768)
  #02  pc 0x000000000000d97f  /system/lib/libbase.so (android::base::SetAborter(std::__1::function<void (char const*)>&&)::$_3::__invoke(char const*)+46)
  #03  pc 0x00000000000052eb  /system/lib/liblog.so (__android_log_assert+174)
  #04  pc 0x000000000064f92f  arm/libppsspp_jni.so (HandleAssert(char const*, char const*, int, char const*, char const*, ...)+194) (BuildId: 12d5e65fd2137fa97249f8b4a259e9608afa3c7f)
  #05  pc 0x0000000000360f8f  arm/libppsspp_jni.so (JitBlockCache::FinalizeBlock(int, bool)+286) (BuildId: 12d5e65fd2137fa97249f8b4a259e9608afa3c7f)
  #06  pc 0x000000000034b7e9  arm/libppsspp_jni.so (MIPSComp::ArmJit::Compile(unsigned int)+192) (BuildId: 12d5e65fd2137fa97249f8b4a259e9608afa3c7f)
  #07  pc 0x0000000000000106 

Next, there's another oldie I've seen before but never figured out. Might simply be some kind of memory corruption, but I think we can add some checks.

SIGSEGV
  #01  pc 0x00000000003796c5  arm/libppsspp_jni.so (CoreTiming::Advance()+136)

This crashes here:

void ProcessFifoWaitEvents()
{
	while (first)
	{
		if (first->time <= (s64)GetTicks())
		{
//			LOG(CPU, "[Scheduler] %s		 (%lld, %lld) ",
//				first->name ? first->name : "?", (u64)GetTicks(), (u64)first->time);
			Event* evt = first;
			first = first->next;
/////////////////////// THE BELOW LINE CRASHES /////////////////////
			event_types[evt->type].callback(evt->userdata, (int)(GetTicks() - evt->time));
			FreeEvent(evt);
		}
		else
		{
			break;
		}
	}
}

So I suppose evt->type might have gotten corrupted?
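As a rough sketch of the kind of check that could be added before the callback is invoked: validate the type index against the table and make sure a callback is present. The struct and function names below are hypothetical stand-ins, not the actual CoreTiming definitions.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

// Hypothetical stand-ins for the CoreTiming structures; the real ones differ.
using TimedCallback = void (*)(uint64_t userdata, int cyclesLate);

struct EventType {
	TimedCallback callback;
	const char *name;
};

struct Event {
	int64_t time;
	uint64_t userdata;
	int type;
	Event *next;
};

// Returns true if the event was dispatched, false if its type looked corrupted.
bool DispatchEventChecked(const std::vector<EventType> &eventTypes, const Event &evt, int64_t now) {
	// A corrupted evt.type would index outside the table, so reject it first.
	if (evt.type < 0 || static_cast<size_t>(evt.type) >= eventTypes.size()) {
		std::fprintf(stderr, "Bad event type %d (userdata %llx)\n", evt.type,
			static_cast<unsigned long long>(evt.userdata));
		return false;
	}
	const EventType &type = eventTypes[evt.type];
	// A registered type without a callback is also not safe to call.
	if (!type.callback) {
		std::fprintf(stderr, "Event type %d (%s) has no callback\n", evt.type,
			type.name ? type.name : "?");
		return false;
	}
	type.callback(evt.userdata, static_cast<int>(now - evt.time));
	return true;
}
```

This would not explain the corruption, but it would turn a SIGSEGV into a log line that includes the bad type value, which might help narrow things down.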

(A curiosity here is how the name of the function has survived from pre-open-source Dolphin, which I took the original timing system from.. there is no fifo :) )

hrydgard avatar Jan 15 '24 21:01 hrydgard

There are also a couple of shutdown hangs. I believe I've already solved one related to ManagedTexture, but here's one where EmuThread is stuck somewhere in __NetShutdown. Unfortunately the stack is missing some detail:

  #02  pc 0x0000000000fd623c  arm64/libppsspp_jni.so (std::__ndk1::thread::join()+28) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #03  pc 0x00000000005d219c  arm64/libppsspp_jni.so (__NetShutdown()+92) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #04  pc 0x00000000005854cc  arm64/libppsspp_jni.so (__KernelShutdown()+376) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #05  pc 0x000000000067d5ec  arm64/libppsspp_jni.so (CPU_Shutdown()+148) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #06  pc 0x000000000067ded8  arm64/libppsspp_jni.so (PSP_Shutdown()+184) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #07  pc 0x0000000000895c0c  arm64/libppsspp_jni.so (EmuScreen::~EmuScreen()+136) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #08  pc 0x0000000000895db8  arm64/libppsspp_jni.so (EmuScreen::~EmuScreen()+16) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)

There are a number of threads that are joined by various functions called by __NetShutdown, seems one of them is stuck. It seems to be the upnp thread:

  #00  pc 0x00000000000e1c1c  /apex/com.android.runtime/lib64/bionic/libc.so (__ppoll+12)
  #01  pc 0x000000000009a36c  /apex/com.android.runtime/lib64/bionic/libc.so (poll+96)
  #02  pc 0x0000000000db1688  arm64/libppsspp_jni.so (receivedata+84) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #03  pc 0x0000000000db0394  arm64/libppsspp_jni.so (getHTTPResponse+204) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #04  pc 0x0000000000daf39c  arm64/libppsspp_jni.so (simpleUPnPcommand+632) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #05  pc 0x0000000000db2080  arm64/libppsspp_jni.so (UPNP_AddPortMapping+292) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #06  pc 0x0000000000688138  arm64/libppsspp_jni.so (PortManager::Add(char const*, unsigned short, unsigned short)+1404) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #07  pc 0x000000000068907c  arm64/libppsspp_jni.so (upnpService(unsigned int)+376) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #08  pc 0x000000000068b9a4  arm64/libppsspp_jni.so (void* std::__ndk1::__thread_proxy<std::__ndk1::tuple<std::__ndk1::unique_ptr<std::__ndk1::__thread_struct, std::__ndk1::default_delete<std::__ndk1::__thread_struct>>, int (*)(unsigned int), unsigned int>>(void*)+48) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)

hrydgard avatar Jan 15 '24 22:01 hrydgard

And here's another one where it appears stuck in vsnprintf, or perhaps more likely the exception is getting triggered over and over:

  #00  pc 0x00000000000e0710  /apex/com.android.runtime/lib64/bionic/libc.so (__sfvwrite+224)
  #01  pc 0x00000000000d64d8  /apex/com.android.runtime/lib64/bionic/libc.so (__vfprintf+9688)
  #02  pc 0x00000000000f5ea0  /apex/com.android.runtime/lib64/bionic/libc.so (vsnprintf+192)
  #03  pc 0x00000000000bafec  /apex/com.android.runtime/lib64/bionic/libc.so (__vsnprintf_chk+60)
  #04  pc 0x000000000049a4f0  arm64/libppsspp_jni.so (snprintf(char*, unsigned long pass_object_size1, char const*, ...)) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #05  pc 0x0000000000498c28  arm64/libppsspp_jni.so (Arm64Dis(unsigned long, unsigned int, char*, int, bool, bool (*)(char*, int, unsigned char*))+2892) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #06  pc 0x00000000004a8378  arm64/libppsspp_jni.so (DisassembleArm64(unsigned char const*, int)+380) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #07  pc 0x00000000006617bc  arm64/libppsspp_jni.so (Memory::HandleFault(unsigned long, void*)+512) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #08  pc 0x000000000085b878  arm64/libppsspp_jni.so (sigsegv_handler(int, siginfo*, void*)) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)

The snprintf is just this one:

line 331
} else if (index_pre) {
	snprintf(instr->text, sizeof(instr->text), "%s%s%s %c%d, [x%d, #%d]!", opname[opc], signExt, sizeSuffix[size], r, Rt, Rn, SignExtend9(imm9));

which is from a loadstore, which checks out.
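For context, the pre-indexed load/store form carries a signed 9-bit offset, which is why the format string goes through SignExtend9. A minimal sketch of what such a helper does, written here as a hypothetical re-implementation (the disassembler's own version may look different):

```cpp
#include <cstdint>

// Hypothetical re-implementation for illustration only.
// Interprets the low 9 bits of imm9 as a two's-complement value.
constexpr int SignExtend9Sketch(uint32_t imm9) {
	// Bit 8 is the sign bit of a 9-bit field; subtracting 0x200 maps
	// 0x100..0x1FF onto -256..-1.
	return ((imm9 & 0x1FF) & 0x100)
		? static_cast<int>(imm9 & 0x1FF) - 0x200
		: static_cast<int>(imm9 & 0x1FF);
}

static_assert(SignExtend9Sketch(0x010) == 16, "positive offsets pass through");
static_assert(SignExtend9Sketch(0x1F0) == -16, "negative offsets are sign-extended");
```

The result is a small integer printed with %d, so the formatting itself should be cheap, which fits the theory that the fault handler is being re-entered rather than snprintf actually hanging.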

hrydgard avatar Jan 15 '24 22:01 hrydgard

Another hang: DrainAndBlockCompileQueue and the compile thread seem to have a possible deadlock:

  #01  pc 0x000000000008dab4  /apex/com.android.runtime/lib64/bionic/libc.so (__futex_wait_ex_owner(void volatile*, bool, int, bool, timespec const*, unsigned int)+432)
  #02  pc 0x00000000000f51f0  /apex/com.android.runtime/lib64/bionic/libc.so (NonPI::MutexLockWithTimeout(pthread_mutex_internal_t*, bool, timespec const*)+252)
  #03  pc 0x0000000000fcdc38  arm64/libppsspp_jni.so (std::__ndk1::mutex::lock()+8) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #04  pc 0x000000000082c934  arm64/libppsspp_jni.so (VulkanRenderManager::CompileThreadFunc()+156) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #05  pc 0x000000000083271c  arm64/libppsspp_jni.so (void* std::__ndk1::__thread_proxy<std::__ndk1::tuple<std::__ndk1::unique_ptr<std::__ndk1::__thread_struct, std::__ndk1::default_delete<std::__ndk1::__thread_struct>>, void (VulkanRenderManager::*)(), VulkanRenderManager*>>(void*)+64) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #03  pc 0x0000000000f93640  /data/app/~~y7ZUpix72C30XvPkv_BYKw==/org.ppsspp.ppsspp-QKHiP-DL2wluIBuv1ZMg_w==/lib/arm64/libppsspp_jni.so (std::__ndk1::condition_variable::wait(std::__ndk1::unique_lock<std::__ndk1::mutex>&)+20) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #04  pc 0x000000000082d770  /data/app/~~y7ZUpix72C30XvPkv_BYKw==/org.ppsspp.ppsspp-QKHiP-DL2wluIBuv1ZMg_w==/lib/arm64/libppsspp_jni.so (VulkanRenderManager::DrainAndBlockCompileQueue()+136) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #05  pc 0x00000000006aa4b8  arm64/libppsspp_jni.so (GPU_Vulkan::DeviceLost()+60) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #06  pc 0x00000000008780e0  arm64/libppsspp_jni.so (NativeShutdownGraphics()+104) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #07  pc 0x0000000000871a00  arm64/libppsspp_jni.so (VulkanEmuThread(ANativeWindow*)) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
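
In the stacks above, one thread sits in mutex::lock inside CompileThreadFunc while the other sits in condition_variable::wait inside DrainAndBlockCompileQueue. For comparison, here is a minimal sketch of a drain-and-block queue where every wait uses a predicate and the worker never runs a task while holding the mutex. The class and member names are hypothetical; this is not the VulkanRenderManager code.

```cpp
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical single-worker queue, for illustration only.
class CompileQueueSketch {
public:
	CompileQueueSketch() : worker_([this] { WorkerLoop(); }) {}

	~CompileQueueSketch() {
		{
			std::lock_guard<std::mutex> lock(mutex_);
			quit_ = true;
		}
		cond_.notify_all();
		worker_.join();
	}

	// Returns false if the queue is currently blocked.
	bool Push(std::function<void()> task) {
		std::lock_guard<std::mutex> lock(mutex_);
		if (blocked_)
			return false;
		tasks_.push_back(std::move(task));
		cond_.notify_all();
		return true;
	}

	// Refuses new work, then waits until queued and in-flight work is done.
	void DrainAndBlock() {
		std::unique_lock<std::mutex> lock(mutex_);
		blocked_ = true;
		// wait() releases the mutex while sleeping; the predicate covers
		// spurious wakeups and notifications sent before the wait began.
		cond_.wait(lock, [this] { return tasks_.empty() && !busy_; });
	}

	void Release() {
		std::lock_guard<std::mutex> lock(mutex_);
		blocked_ = false;
	}

private:
	void WorkerLoop() {
		std::unique_lock<std::mutex> lock(mutex_);
		for (;;) {
			cond_.wait(lock, [this] { return quit_ || !tasks_.empty(); });
			if (tasks_.empty())
				return;  // quit_ was set and nothing is left to run
			std::function<void()> task = std::move(tasks_.front());
			tasks_.pop_front();
			busy_ = true;
			lock.unlock();
			task();  // run the task without holding the mutex
			lock.lock();
			busy_ = false;
			cond_.notify_all();
		}
	}

	std::mutex mutex_;
	std::condition_variable cond_;
	std::deque<std::function<void()>> tasks_;
	bool blocked_ = false;
	bool busy_ = false;
	bool quit_ = false;
	std::thread worker_;  // declared last so the other members exist before it starts
};
```

The design point is that the drain condition (empty queue and no task in flight) is re-evaluated under the same mutex that the worker takes when it finishes a task, so a notification cannot be lost between the check and the wait.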

hrydgard avatar Jan 15 '24 22:01 hrydgard

> There are a number of threads that are joined by various functions called by __NetShutdown, seems one of them is stuck. It seems to be the upnp thread:

Hmm.. the connection to the router might be stalled or having problems, so it waited until the timeout (I set the default timeout to 2000 ms, as there are slow routers that need at least 1 second to be detected). This kind of issue shouldn't be consistent or easily reproduced; if it is, there's an actual bug.

anr2me avatar Jan 16 '24 01:01 anr2me

Yeah, it's likely some kind of one-off; I only see a single report of this. I tried to find a path where there would be more than one timeout between two checks of the thread-exit variable, but couldn't, so I'm not convinced there's anything we can do about it...
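
The property being checked here can be sketched as follows: if the exit flag is tested before every blocking step, and each step blocks for at most one timeout, then a join should wait no longer than roughly one timeout after the flag is set. The function and parameter names are hypothetical, and the steps stand in for the real network calls.

```cpp
#include <atomic>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical worker body: each step stands in for one network operation
// that blocks for at most one timeout (e.g. the 2000 ms mentioned above).
// Returns how many steps were executed before the stop request was seen.
size_t RunStepsUntilStopped(const std::vector<std::function<void()>> &steps,
                            const std::atomic<bool> &stopRequested) {
	size_t executed = 0;
	for (const auto &step : steps) {
		// Checking the flag before every blocking step means at most one
		// step (one timeout) can still run after the flag is set.
		if (stopRequested.load())
			break;
		step();
		executed++;
	}
	return executed;
}
```

A path with two blocking steps and no flag check in between would double the worst-case wait, which is the kind of path that was searched for and not found.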

hrydgard avatar Jan 16 '24 08:01 hrydgard

Beta 2 from now on.

The shutdown race condition still doesn't seem completely cured, and I got a shutdown hang I haven't seen before:

  #03  pc 0x0000000000f958c0  arm64/libppsspp_jni.so (std::__ndk1::condition_variable::wait(std::__ndk1::unique_lock<std::__ndk1::mutex>&)+20) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #04  pc 0x0000000000858e18  arm64/libppsspp_jni.so (WaitableCounter::Wait()+76) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #05  pc 0x0000000000858a58  arm64/libppsspp_jni.so (ParallelRangeLoop(ThreadManager*, std::__ndk1::function<void (int, int)> const&, int, int, int, TaskPriority)+124) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #06  pc 0x000000000073686c  arm64/libppsspp_jni.so (GPURecord::mymemmem(unsigned char const*, unsigned long, unsigned long, unsigned char const*, unsigned long, unsigned long)) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #07  pc 0x0000000000734ca0  arm64/libppsspp_jni.so (GPURecord::EmitCommandWithRAM(GPURecord::CommandType, void const*, unsigned int, unsigned int)) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #08  pc 0x0000000000734608  arm64/libppsspp_jni.so (GPURecord::NotifyCommand(unsigned int)+1348) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #09  pc 0x000000000073c078  arm64/libppsspp_jni.so (GPUCommon::SlowRunLoop(DisplayList&)+244) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #10  pc 0x000000000073be24  arm64/libppsspp_jni.so (GPUCommon::InterpretList(DisplayList&)+644) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #11  pc 0x000000000073b32c  arm64/libppsspp_jni.so (GPUCommon::ProcessDLQueue()+100) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #12  pc 0x000000000073b1b0  arm64/libppsspp_jni.so (GPUCommon::EnqueueList(unsigned int, unsigned int, int, PSPPointer<PspGeListArgs>, bool)+1852) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #13  pc 0x00000000005681e4  arm64/libppsspp_jni.so (void WrapU_UUIU<&sceGeListEnQueue(unsigned int, unsigned int, int, unsigned int)>()+60) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #14  pc 0x0000000000542a68  arm64/libppsspp_jni.so (CallSyscallWithoutFlags(HLEFunction const*)+52) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)

Some pool worker stacks:

  #00  pc 0x0000000000070080  /apex/com.android.runtime/lib64/bionic/libc.so (je_tcache_bin_flush_small)
  #01  pc 0x0000000000064ae4  /apex/com.android.runtime/lib64/bionic/libc.so (ifree+720)
  #02  pc 0x0000000000064ce4  /apex/com.android.runtime/lib64/bionic/libc.so (je_free+112)
  #03  pc 0x0000000000858ee8  /data/app/~~5Qw_xZval2LqhbpV-oJ_qg==/org.ppsspp.ppsspp-hUUVp3_A9lz8K6QmRWQQ0g==/lib/arm64/libppsspp_jni.so (LoopRangeTask::~LoopRangeTask()+72) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #04  pc 0x000000000085a0f8  /data/app/~~5Qw_xZval2LqhbpV-oJ_qg==/org.ppsspp.ppsspp-hUUVp3_A9lz8K6QmRWQQ0g==/lib/arm64/libppsspp_jni.so (WorkerThreadFunc(GlobalThreadContext*, TaskThreadContext*)) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #05  pc 0x000000000085bbf8  /data/app/~~5Qw_xZval2LqhbpV-oJ_qg==/org.ppsspp.ppsspp-hUUVp3_A9lz8K6QmRWQQ0g==/lib/arm64/libppsspp_jni.so (void* std::__ndk1::__thread_proxy<std::__ndk1::tuple<std::__ndk1::unique_ptr<std::__ndk1::__thread_struct, std::__ndk1::default_delete<std::__ndk1::__thread_struct>>, void (*)(GlobalThreadContext*, TaskThreadContext*), GlobalThreadContext*, TaskThreadContext*>>(void*)+48) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #00  pc 0x0000000000064cc4  /apex/com.android.runtime/lib64/bionic/libc.so (je_free+80)
  #01  pc 0x0000000000858ee8  /data/app/~~5Qw_xZval2LqhbpV-oJ_qg==/org.ppsspp.ppsspp-hUUVp3_A9lz8K6QmRWQQ0g==/lib/arm64/libppsspp_jni.so (LoopRangeTask::~LoopRangeTask()+72) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #02  pc 0x000000000085a0f8  /data/app/~~5Qw_xZval2LqhbpV-oJ_qg==/org.ppsspp.ppsspp-hUUVp3_A9lz8K6QmRWQQ0g==/lib/arm64/libppsspp_jni.so (WorkerThreadFunc(GlobalThreadContext*, TaskThreadContext*)) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #03  pc 0x000000000085bbf8  /data/app/~~5Qw_xZval2LqhbpV-oJ_qg==/org.ppsspp.ppsspp-hUUVp3_A9lz8K6QmRWQQ0g==/lib/arm64/libppsspp_jni.so (void* std::__ndk1::__thread_proxy<std::__ndk1::tuple<std::__ndk1::unique_ptr<std::__ndk1::__thread_struct, std::__ndk1::default_delete<std::__ndk1::__thread_struct>>, void (*)(GlobalThreadContext*, TaskThreadContext*), GlobalThreadContext*, TaskThreadContext*>>(void*)+48) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
#00  pc 0x0000000000075dc0  /apex/com.android.runtime/lib64/bionic/libc.so (__memchr_aarch64)
  #01  pc 0x0000000000736a54  /data/app/~~5Qw_xZval2LqhbpV-oJ_qg==/org.ppsspp.ppsspp-hUUVp3_A9lz8K6QmRWQQ0g==/lib/arm64/libppsspp_jni.so (std::__ndk1::__function::__func<GPURecord::mymemmem(unsigned char const*, unsigned long, unsigned long, unsigned char const*, unsigned long, unsigned long)::$_0, std::__ndk1::allocator<GPURecord::mymemmem(unsigned char const*, unsigned long, unsigned long, unsigned char const*, unsigned long, unsigned long)::$_0>, void (int, int)>::operator()(int&&, int&&)) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #02  pc 0x0000000000858f50  /data/app/~~5Qw_xZval2LqhbpV-oJ_qg==/org.ppsspp.ppsspp-hUUVp3_A9lz8K6QmRWQQ0g==/lib/arm64/libppsspp_jni.so (LoopRangeTask::Run()+68) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #03  pc 0x000000000085a0e8  /data/app/~~5Qw_xZval2LqhbpV-oJ_qg==/org.ppsspp.ppsspp-hUUVp3_A9lz8K6QmRWQQ0g==/lib/arm64/libppsspp_jni.so (WorkerThreadFunc(GlobalThreadContext*, TaskThreadContext*)) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #04  pc 0x000000000085bbf8  /data/app/~~5Qw_xZval2LqhbpV-oJ_qg==/org.ppsspp.ppsspp-hUUVp3_A9lz8K6QmRWQQ0g==/lib/arm64/libppsspp_jni.so (void* std::__ndk1::__thread_proxy<std::__ndk1::tuple<std::__ndk1::unique_ptr<std::__ndk1::__thread_struct, std::__ndk1::default_delete<std::__ndk1::__thread_struct>>, void (*)(GlobalThreadContext*, TaskThreadContext*), GlobalThreadContext*, TaskThreadContext*>>(void*)+48) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #00  pc 0x0000000000070080  /apex/com.android.runtime/lib64/bionic/libc.so (je_tcache_bin_flush_small)
  #01  pc 0x0000000000064ae4  /apex/com.android.runtime/lib64/bionic/libc.so (ifree+720)
  #02  pc 0x0000000000064ce4  /apex/com.android.runtime/lib64/bionic/libc.so (je_free+112)
  #03  pc 0x0000000000858ee8  /data/app/~~5Qw_xZval2LqhbpV-oJ_qg==/org.ppsspp.ppsspp-hUUVp3_A9lz8K6QmRWQQ0g==/lib/arm64/libppsspp_jni.so (LoopRangeTask::~LoopRangeTask()+72) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #04  pc 0x000000000085a0f8  /data/app/~~5Qw_xZval2LqhbpV-oJ_qg==/org.ppsspp.ppsspp-hUUVp3_A9lz8K6QmRWQQ0g==/lib/arm64/libppsspp_jni.so (WorkerThreadFunc(GlobalThreadContext*, TaskThreadContext*)) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #05  pc 0x000000000085bbf8  /data/app/~~5Qw_xZval2LqhbpV-oJ_qg==/org.ppsspp.ppsspp-hUUVp3_A9lz8K6QmRWQQ0g==/lib/arm64/libppsspp_jni.so (void* std::__ndk1::__thread_proxy<std::__ndk1::tuple<std::__ndk1::unique_ptr<std::__ndk1::__thread_struct, std::__ndk1::default_delete<std::__ndk1::__thread_struct>>, void (*)(GlobalThreadContext*, TaskThreadContext*), GlobalThreadContext*, TaskThreadContext*>>(void*)+48) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)

Weird stuff, almost like there's a hang in the memory allocator (jemalloc) ?

Or it's just stuck performing the same thing over and over somehow..

hrydgard avatar Jan 17 '24 08:01 hrydgard

Another thread hang, interesting:

  #00  pc 0x00000000000881b0  /apex/com.android.runtime/lib64/bionic/libc.so (syscall+32)
  #01  pc 0x00000000000f1660  /apex/com.android.runtime/lib64/bionic/libc.so (pthread_join+268)
  #02  pc 0x0000000000fd623c  arm64/libppsspp_jni.so (std::__ndk1::thread::join()+28) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #03  pc 0x000000000082d04c  arm64/libppsspp_jni.so (VulkanRenderManager::StopThread()+236) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #04  pc 0x000000000082d2ac  arm64/libppsspp_jni.so (VulkanRenderManager::DestroyBackbuffers()+16) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #05  pc 0x0000000000873db4  arm64/libppsspp_jni.so (AndroidVulkanContext::Resize()+96) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #06  pc 0x0000000000878754  arm64/libppsspp_jni.so (NativeFrame(GraphicsContext*)+1480) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #07  pc 0x0000000000871900  arm64/libppsspp_jni.so (VulkanEmuThread(ANativeWindow*)) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #08  pc 0x000000000087317c  arm64/libppsspp_jni.so (void* std::__ndk1::__thread_proxy<std::__ndk1::tuple<std::__ndk1::unique_ptr<std::__ndk1::__thread_struct, std::__ndk1::default_delete<std::__ndk1::__thread_struct>>, void (*)(ANativeWindow*), ANativeWindow*>>(void*)+44) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)

vs

  #00  pc 0x00000000000881b0  /apex/com.android.runtime/lib64/bionic/libc.so (syscall+32)
  #01  pc 0x000000000008ca7c  /apex/com.android.runtime/lib64/bionic/libc.so (__futex_wait_ex+148)
  #02  pc 0x00000000000eff60  /apex/com.android.runtime/lib64/bionic/libc.so (pthread_cond_wait+84)
  #03  pc 0x0000000000f93640  arm64/libppsspp_jni.so (std::__ndk1::condition_variable::wait(std::__ndk1::unique_lock<std::__ndk1::mutex>&)+20) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #04  pc 0x000000000082b354  arm64/libppsspp_jni.so (Promise<VkPipeline_T*>::BlockUntilReady()+112) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #05  pc 0x000000000083655c  arm64/libppsspp_jni.so (VulkanQueueRunner::PerformRenderPass(VKRStep const&, VkCommandBuffer_T*, int, QueueProfileContext&)+2384) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #06  pc 0x0000000000835920  arm64/libppsspp_jni.so (VulkanQueueRunner::RunSteps(std::__ndk1::vector<VKRStep*, std::__ndk1::allocator<VKRStep*>>&, int, FrameData&, FrameDataShared&, bool)+524) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #07  pc 0x000000000082dadc  arm64/libppsspp_jni.so (VulkanRenderManager::Run(VKRRenderThreadTask&)+732) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #08  pc 0x000000000082c744  arm64/libppsspp_jni.so (VulkanRenderManager::ThreadFunc()+240) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #09  pc 0x000000000083271c  arm64/libppsspp_jni.so (void* std::__ndk1::__thread_proxy<std::__ndk1::tuple<std::__ndk1::unique_ptr<std::__ndk1::__thread_struct, std::__ndk1::default_delete<std::__ndk1::__thread_struct>>, void (VulkanRenderManager::*)(), VulkanRenderManager*>>(void*)+64) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)

I think this will be fixed by one of my upcoming changes.

hrydgard avatar Jan 17 '24 12:01 hrydgard

Report from beta 1:

  #00  pc 0x000000000004ed98  /apex/com.android.runtime/lib64/bionic/libc.so (__memcpy+232)
  #01  pc 0x00000000007215e0  arm64/libppsspp_jni.so (TextureReplacer::NotifyTextureDecoded(ReplacedTexture*, ReplacedTextureDecodeInfo const&, void const*, int, int, int, int, int, int)+1332) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #02  pc 0x00000000006b5718  arm64/libppsspp_jni.so (TextureCacheVulkan::BuildTexture(TexCacheEntry*)+3184) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #03  pc 0x000000000070bb30  arm64/libppsspp_jni.so (TextureCacheCommon::ApplyTexture()+468) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #04  pc 0x00000000006a7608  arm64/libppsspp_jni.so (DrawEngineVulkan::DoFlush()+1572) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #05  pc 0x000000000073eba4  arm64/libppsspp_jni.so (GPUCommonHW::FastRunLoop(DisplayList&)+272) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #06  pc 0x00000000007385fc  arm64/libppsspp_jni.so (GPUCommon::InterpretList(DisplayList&)+608) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #07  pc 0x0000000000737b28  arm64/libppsspp_jni.so (GPUCommon::ProcessDLQueue()+100) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #08  pc 0x00000000007379ac  arm64/libppsspp_jni.so (GPUCommon::EnqueueList(unsigned int, unsigned int, int, PSPPointer<PspGeListArgs>, bool)+1852) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #09  pc 0x0000000000568f1c  arm64/libppsspp_jni.so (void WrapU_UUIU<&sceGeListEnQueue(unsigned int, unsigned int, int, unsigned int)>()+60) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)

Also beta 1:

  #00  pc 0x0000000000664008 arm64/libppsspp_jni.so (Memory::Write_U32(unsigned int, unsigned int)+124) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #01  pc 0x00000000004c6e04 arm64/libppsspp_jni.so (CWCheatEngine::ExecuteOp(CheatOperation const&, CheatCode const&, unsigned long&)+4348) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #02  pc 0x00000000004c46dc arm64/libppsspp_jni.so (CWCheatEngine::Run()+176) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #03  pc 0x00000000004c4000 arm64/libppsspp_jni.so (hleCheat(unsigned long long, int)+788) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #04  pc 0x00000000004c1738 arm64/libppsspp_jni.so (CoreTiming::Advance()+140) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)

hrydgard avatar Jan 18 '24 11:01 hrydgard

beta 3:

  #04  pc 0x0000000000fd3bc0  arm64/libppsspp_jni.so (std::__ndk1::mutex::lock()+8) (BuildId: bd9158ed79e7cb9fffa0fc67e21ea41d9fc28569)
  #05  pc 0x00000000008a4970  arm64/libppsspp_jni.so (GameInfo::GetTitle()+20) (BuildId: bd9158ed79e7cb9fffa0fc67e21ea41d9fc28569)
  #06  pc 0x00000000008ca6d8  arm64/libppsspp_jni.so (GameScreen::CreateViews()+1792) (BuildId: bd9158ed79e7cb9fffa0fc67e21ea41d9fc28569)
  #07  pc 0x0000000000dd0824  arm64/libppsspp_jni.so (UIScreen::DoRecreateViews()+180) (BuildId: bd9158ed79e7cb9fffa0fc67e21ea41d9fc28569)
  #08  pc 0x0000000000dd1120  arm64/libppsspp_jni.so (UIScreen::render(ScreenRenderMode)+192) (BuildId: bd9158ed79e7cb9fffa0fc67e21ea41d9fc28569)
  #09  pc 0x00000000008cdafc  arm64/libppsspp_jni.so (GameScreen::render(ScreenRenderMode)+48) (BuildId: bd9158ed79e7cb9fffa0fc67e21ea41d9fc28569)
  #10  pc 0x0000000000dcfafc  arm64/libppsspp_jni.so (ScreenManager::render()+732) (BuildId: bd9158ed79e7cb9fffa0fc67e21ea41d9fc28569)
  #11  pc 0x0000000000880000  arm64/libppsspp_jni.so (NativeFrame(GraphicsContext*)+796) (BuildId: bd9158ed79e7cb9fffa0fc67e21ea41d9fc28569)
  #12  pc 0x0000000000879470  arm64/libppsspp_jni.so (VulkanEmuThread(ANativeWindow*)) (BuildId: bd9158ed79e7cb9fffa0fc67e21ea41d9fc28569)

Maybe it's just slowness in scoped storage land. A worker thread has the following at the top of its stack (the rest is missing):

  #00  pc 0x00000000000979dc  /apex/com.android.runtime/lib64/bionic/libc.so (syscall+28)
  #01  pc 0x00000000003a8af4  /apex/com.android.art/lib64/libart.so (art::ConditionVariable::WaitHoldingLocks(art::Thread*)+140)
  #02  pc 0x000000000077f71c  /apex/com.android.art/lib64/libart.so (artJniMethodEnd+204)
  #03  pc 0x000000000020facc  /apex/com.android.art/lib64/libart.so (art_jni_method_end+12)
  at android.os.BinderProxy.transactNative (Native method)
  at android.os.BinderProxy.transact (BinderProxy.java:678)
  at android.content.ContentProviderProxy.query (ContentProviderNative.java:479)
  at android.content.ContentResolver.query (ContentResolver.java:1245)
  at android.content.ContentResolver.query (ContentResolver.java:1171)
  at android.content.ContentResolver.query (ContentResolver.java:1127)
  at org.ppsspp.ppsspp.PpssppActivity.listContentUriDir (PpssppActivity.java:275)

hrydgard avatar Jan 18 '24 18:01 hrydgard

#00  pc 0x000000000059c290  arm64/libppsspp_jni.so (SceKernelVplHeader::Allocate(unsigned int)+104) (BuildId: bd9158ed79e7cb9fffa0fc67e21ea41d9fc28569)
#01  pc 0x0000000000597d74  arm64/libppsspp_jni.so (__KernelAllocateVpl(int, unsigned int, unsigned int, unsigned int&, bool, char const*)) (BuildId: bd9158ed79e7cb9fffa0fc67e21ea41d9fc28569)
#02  pc 0x0000000000598290  arm64/libppsspp_jni.so (sceKernelTryAllocateVpl(int, unsigned int, unsigned int)+44) (BuildId: bd9158ed79e7cb9fffa0fc67e21ea41d9fc28569)
#03  pc 0x0000000000588634  arm64/libppsspp_jni.so (void WrapI_IUU<&sceKernelTryAllocateVpl(int, unsigned int, unsigned int)>()+32) (BuildId: bd9158ed79e7cb9fffa0fc67e21ea41d9fc28569)
#04  pc 0x0000000000544208  arm64/libppsspp_jni.so (CallSyscallWithoutFlags(HLEFunction const*)+52) (BuildId: bd9158ed79e7cb9fffa0fc67e21ea41d9fc28569)

This crashes here:

	PSPPointer<SceKernelVplBlock> SplitBlock(PSPPointer<SceKernelVplBlock> b, u32 allocBlocks) {
		u32 prev = b.ptr;
		b->sizeInBlocks -= allocBlocks;

		b += b->sizeInBlocks;
		b->sizeInBlocks = allocBlocks;   // << CRASH HERE
		b->next = prev;

		return b;
	}

Suspicious... Probably the block header got corrupted.
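
If the header is indeed corrupted, a sanity check before the split could turn the crash into a reportable error. Below is a minimal sketch on a simplified model where the heap is a vector of block-sized slots and next holds an index rather than a PSP address; the struct and function names are hypothetical and the real SceKernelVplBlock layout differs.

```cpp
#include <cstdint>
#include <vector>

// Simplified, hypothetical model of a VPL block header.
struct VplBlockSketch {
	uint32_t next;          // index of the linked block (a PSP address in the real code)
	uint32_t sizeInBlocks;  // size of this block, in slots
};

// Splits allocBlocks off the end of the block at index b.
// Returns the index of the new block, or -1 if the header fails the sanity checks.
int64_t SplitBlockChecked(std::vector<VplBlockSketch> &heap, uint32_t b, uint32_t allocBlocks) {
	if (b >= heap.size())
		return -1;
	const uint32_t size = heap[b].sizeInBlocks;
	// The block must extend no further than the end of the heap and must be
	// strictly larger than the requested allocation.
	if (allocBlocks == 0 || size <= allocBlocks || size > heap.size() - b)
		return -1;
	heap[b].sizeInBlocks = size - allocBlocks;
	const uint32_t n = b + heap[b].sizeInBlocks;  // provably inside the heap now
	heap[n].sizeInBlocks = allocBlocks;
	heap[n].next = b;
	return n;
}
```

With a corrupted sizeInBlocks, the unchecked version computes a pointer far outside the pool, which would match a crash on the first write through it.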

Another:

  #00  pc 0x000000000070fab8  arm64/libppsspp_jni.so (TextureCacheCommon::InvalidateAll(GPUInvalidationType)+124) (BuildId: bd9158ed79e7cb9fffa0fc67e21ea41d9fc28569)
  #01  pc 0x00000000007409cc  arm64/libppsspp_jni.so (GPUCommonHW::InvalidateCache(unsigned int, int, GPUInvalidationType)+72) (BuildId: bd9158ed79e7cb9fffa0fc67e21ea41d9fc28569)
  #02  pc 0x0000000000586994  arm64/libppsspp_jni.so (sceKernelDcacheWritebackAll()+40) (BuildId: bd9158ed79e7cb9fffa0fc67e21ea41d9fc28569)
  #03  pc 0x000000000054c2ec  arm64/libppsspp_jni.so (void WrapI_V<&sceKernelDcacheWritebackAll()>()+8) (BuildId: bd9158ed79e7cb9fffa0fc67e21ea41d9fc28569)
  #04  pc 0x0000000000544208  arm64/libppsspp_jni.so (CallSyscallWithoutFlags(HLEFunction const*)+52) (BuildId: bd9158ed79e7cb9fffa0fc67e21ea41d9fc28569)

Crash here:

	for (TexCache::iterator iter = cache_.begin(), end = cache_.end(); iter != end; ++iter) {
		if (iter->second->GetHashStatus() == TexCacheEntry::STATUS_RELIABLE) {
			iter->second->SetHashStatus(TexCacheEntry::STATUS_HASHING);
		}
		iter->second->invalidHint++;
	}

hrydgard avatar Jan 20 '24 22:01 hrydgard

Crash in libpng, not good. libpng17 seems unmaintained, no updates since 2017 :(

This is at libpng17/pngread.c:1312.

backtrace:
  #00  pc 0x000000000004eed4  /apex/com.android.runtime/lib64/bionic/libc.so (__memcpy+292)
  #01  pc 0x0000000000e06488  arm64/libppsspp_jni.so (png_image_memory_read) (BuildId: d7e251231db61eedf91ca53c301ab3e78736046b)
  #02  pc 0x0000000000e1230c  arm64/libppsspp_jni.so (png_crc_read+36) (BuildId: d7e251231db61eedf91ca53c301ab3e78736046b)
  #03  pc 0x0000000000e03720  arm64/libppsspp_jni.so (png_read_IDAT) (BuildId: d7e251231db61eedf91ca53c301ab3e78736046b)
  #04  pc 0x0000000000e03578  arm64/libppsspp_jni.so (png_read_row+252) (BuildId: d7e251231db61eedf91ca53c301ab3e78736046b)
  #05  pc 0x0000000000e06334  arm64/libppsspp_jni.so (png_image_read_direct) (BuildId: d7e251231db61eedf91ca53c301ab3e78736046b)
  #06  pc 0x0000000000e02920  arm64/libppsspp_jni.so (png_safe_execute+112) (BuildId: d7e251231db61eedf91ca53c301ab3e78736046b)
  #07  pc 0x0000000000e046b8  arm64/libppsspp_jni.so (png_image_finish_read+352) (BuildId: d7e251231db61eedf91ca53c301ab3e78736046b)
  #08  pc 0x00000000007f99b4  arm64/libppsspp_jni.so (pngLoadPtr(unsigned char const*, unsigned long, int*, int*, unsigned char**)+168) (BuildId: d7e251231db61eedf91ca53c301ab3e78736046b)
  #09  pc 0x000000000086dc1c  arm64/libppsspp_jni.so (TempImage::LoadTextureLevelsFromFileData(unsigned char const*, unsigned long, ImageFileType)+460) (BuildId: d7e251231db61eedf91ca53c301ab3e78736046b)
  #10  pc 0x000000000086dfb0  arm64/libppsspp_jni.so (CreateTextureFromFileData(Draw::DrawContext*, unsigned char const*, unsigned long, ImageFileType, bool, char const*)+128) (BuildId: d7e251231db61eedf91ca53c301ab3e78736046b)
  #11  pc 0x00000000008a1580  arm64/libppsspp_jni.so (GameInfoCache::SetupTexture(std::__ndk1::shared_ptr<GameInfo>&, Draw::DrawContext*, GameInfoTex&)+188) (BuildId: d7e251231db61eedf91ca53c301ab3e78736046b)
  #12  pc 0x00000000008a1168  arm64/libppsspp_jni.so (GameInfoCache::GetInfo(Draw::DrawContext*, Path const&, int)+272) (BuildId: d7e251231db61eedf91ca53c301ab3e78736046b)
  #13  pc 0x00000000008b10d4  arm64/libppsspp_jni.so (MainScreen::DrawBackgroundFor(UIContext&, Path const&, float)+104) (BuildId: d7e251231db61eedf91ca53c301ab3e78736046b)
  #14  pc 0x00000000008b0fe0  arm64/libppsspp_jni.so (MainScreen::DrawBackground(UIContext&)+108) (BuildId: d7e251231db61eedf91ca53c301ab3e78736046b)
  #15  pc 0x0000000000dcd678  arm64/libppsspp_jni.so (UIScreen::render(ScreenRenderMode)+300) (BuildId: d7e251231db61eedf91ca53c301ab3e78736046b)
  #16  pc 0x0000000000dcbffc  arm64/libppsspp_jni.so (ScreenManager::render()+732) (BuildId: d7e251231db61eedf91ca53c301ab3e78736046b)
  #17  pc 0x000000000087c718  arm64/libppsspp_jni.so (NativeFrame(GraphicsContext*)+852) (BuildId: d7e251231db61eedf91ca53c301ab3e78736046b)
  #18  pc 0x0000000000873b10  arm64/libppsspp_jni.so (UpdateRunLoopAndroid(_JNIEnv*)+36) (BuildId: d7e251231db61eedf91ca53c301ab3e78736046b)
  #19  pc 0x0000000000876fb4  arm64/libppsspp_jni.so (EmuThreadFunc()) (BuildId: d7e251231db61eedf91ca53c301ab3e78736046b)
  #20  pc 0x00000000004d650c  arm64/libppsspp_jni.so (void* std::__ndk1::__thread_proxy<std::__ndk1::tuple<std::__ndk1::unique_ptr<std::__ndk1::__thread_struct, std::__ndk1::default_delete<std::__ndk1::__thread_struct>>, void (*)()>>(void*)+44) (BuildId: d7e251231db61eedf91ca53c301ab3e78736046b)

hrydgard avatar Jan 22 '24 21:01 hrydgard

Maybe switch to another PNG library? https://github.com/pnggroup/libpng

sum2012 avatar Jan 28 '24 05:01 sum2012

Yes, I'm thinking of trying spng instead.

https://libspng.org/

hrydgard avatar Jan 28 '24 09:01 hrydgard

Although, I now think it's really due to a data loading race condition in GameInfoCache... Ugh.

I tried spng, and it's pretty nice; it just lacks the ability to specify a byte stride in encode/decode. So I'll probably switch to it later anyway since it's faster, but not for the 1.17 series.
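
To illustrate what a missing byte stride means in practice: when a decoder can only write tightly packed rows, the caller has to copy each row into the strided destination afterwards. A minimal sketch of that extra step, using a hypothetical helper that is not part of spng or PPSSPP:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copies `height` rows of `rowBytes` each from a tightly
// packed source into a destination whose rows start `dstStride` bytes apart.
// Returns false if the stride is too small to hold a row.
bool CopyRowsWithStride(const uint8_t *src, size_t rowBytes, size_t height,
                        uint8_t *dst, size_t dstStride) {
	if (dstStride < rowBytes)
		return false;
	for (size_t y = 0; y < height; y++) {
		// Source rows are back to back; destination rows leave padding
		// (dstStride - rowBytes bytes) untouched after each row.
		std::memcpy(dst + y * dstStride, src + y * rowBytes, rowBytes);
	}
	return true;
}
```

It works, but it costs an extra pass over the image and a temporary buffer, which is why native stride support in the decoder is the nicer option.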

hrydgard avatar Jan 28 '24 10:01 hrydgard

Pretty sure I've solved the png loading crash now.

Here's a savestate load-from-rewind problem, hm:

  #00  pc 0x0000000000690f54  arm64/libppsspp_jni.so (BlockAllocator::Free(unsigned int)+24) (BuildId: 7918a182fa2ad427379f7705c2f0bcb662ac20f4)
  #01  pc 0x000000000059b37c  arm64/libppsspp_jni.so (PartitionMemoryBlock::~PartitionMemoryBlock()+48) (BuildId: 7918a182fa2ad427379f7705c2f0bcb662ac20f4)
  #02  pc 0x0000000000585fb8  arm64/libppsspp_jni.so (KernelObjectPool::DoState(PointerWrap&)+212) (BuildId: 7918a182fa2ad427379f7705c2f0bcb662ac20f4)
  #03  pc 0x0000000000585c58  arm64/libppsspp_jni.so (__KernelDoState(PointerWrap&)+108) (BuildId: 7918a182fa2ad427379f7705c2f0bcb662ac20f4)
  #04  pc 0x0000000000674f78  arm64/libppsspp_jni.so (SaveState::SaveStart::DoState(PointerWrap&)+572) (BuildId: 7918a182fa2ad427379f7705c2f0bcb662ac20f4)
  #05  pc 0x00000000006747f4  arm64/libppsspp_jni.so (CChunkFileReader::Error CChunkFileReader::LoadPtr<SaveState::SaveStart>(unsigned char*, SaveState::SaveStart&, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char>>*)+88) (BuildId: 7918a182fa2ad427379f7705c2f0bcb662ac20f4)
  #06  pc 0x000000000067975c  arm64/libppsspp_jni.so (SaveState::StateRingbuffer::Restore(std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char>>*)+200) (BuildId: 7918a182fa2ad427379f7705c2f0bcb662ac20f4)
  #07  pc 0x000000000067960c  arm64/libppsspp_jni.so (SaveState::HandleLoadFailure()+100) (BuildId: 7918a182fa2ad427379f7705c2f0bcb662ac20f4)
  #08  pc 0x0000000000679f9c  arm64/libppsspp_jni.so (SaveState::Process()+1244) (BuildId: 7918a182fa2ad427379f7705c2f0bcb662ac20f4)
  #09  pc 0x000000000067fbc4  arm64/libppsspp_jni.so (PSP_RunLoopWhileState()+148) (BuildId: 7918a182fa2ad427379f7705c2f0bcb662ac20f4)
  #10  pc 0x00000000008a16ec  arm64/libppsspp_jni.so (EmuScreen::render(ScreenRenderMode)+1056) (BuildId: 7918a182fa2ad427379f7705c2f0bcb662ac20f4)

hrydgard avatar Jan 28 '24 22:01 hrydgard