sourcepawn icon indicating copy to clipboard operation
sourcepawn copied to clipboard

Async JIT execution leads to crashes

Open Wend4r opened this issue 3 years ago • 7 comments

Hello, I would like to see support in SourcePawn for executing JIT OP-Code in several threads at the same time.

For example, when using sv_parallel_send <numthreads> that executes SetTransmit in different threads (and in SourceMod), server does crashes image

I also need my upcoming PR to SourceMod with async

native void Call_StartFunction(Handle plugin, Function func, bool async = false);

(to make huge hierarchical miscalculations with the Navigation Meshes of the map without loading the main game thread (tick delays in milliseconds/aka server var) :P)

Wend4r avatar Aug 06 '22 07:08 Wend4r

  sub_320880(
    &dword_76F380,
    "sv_parallel_send",
    "0",
    0x80000,
    "Pack and send snapshots in parallel for smoother server tick rate at the expense of spending more CPU.",
    sub_1AF5F0);

sv_parallel_send description also talks about unloading the main game thread

Wend4r avatar Aug 06 '22 07:08 Wend4r

Also I remember that IMemAlloc::Alloc, IMemAlloc::Realloc and IMemAlloc::Free can be called from different threads that can hooks SourceMod DHooks. I had to rewrite everything to SourceMod C++ Extension 😠

Wend4r avatar Aug 06 '22 07:08 Wend4r

Example crash details

Stack trace
Function Offset
sourcepawn.jit.x86.so!sp::PoolScope::PoolScope() 0x2a
sourcepawn.jit.x86.so!sp::CompilerBase::CompilerBase(sp::PluginRuntime*, sp::MethodInfo*) 0x42
sourcepawn.jit.x86.so!sp::Compiler::Compiler(sp::PluginRuntime*, sp::MethodInfo*) 0x17
sourcepawn.jit.x86.so!sp::CompilerBase::Compile(sp::PluginContext*, ke::RefPtrsp::MethodInfo, int*) 0x2d
sourcepawn.jit.x86.so!sp::Environment::Invoke(sp::PluginContext*, ke::RefPtrsp::MethodInfo const&, int*) 0x1ae
sourcepawn.jit.x86.so!sp::PluginContext::Invoke(unsigned int, int const*, unsigned int, int*) 0x208
sourcepawn.jit.x86.so!sp::ScriptedInvoker::Invoke(int*) 0x260
sourcepawn.jit.x86.so!sp::ScriptedInvoker::Execute(int*) 0x5f
sourcemod.logic.so!CForward::Execute(int*, SourceMod::IForwardFilter*) 0x1d4
Frame context
Thread 7 (crashed):
   0: sourcepawn.jit.x86.so!sp::PoolScope::PoolScope() + 0x2a
      eip: 0xf07745ca  esp: 0xe2bfdbf0  ebp: 0xe2bfdc08  ebx: 0x0fbc5f20
      esi: 0xe2bfdc7c  edi: 0x00000000  eax: 0x00000000  ecx: 0xe2bffbd4
      edx: 0x00000000  efl: 0x00010246  

      f07745b9  74 0d           jz 0xf07745c8
      f07745bb  a1 4c e7 7c f0  mov eax, [0xf07ce74c]
      f07745c0  89 04 24        mov [esp], eax
      f07745c3  e8 48 22 73 07  call 0xf7ea6810
      f07745c8  89 06           mov [esi], eax
  >   f07745ca  ff 40 08        inc dword [eax+0x8]
      f07745cd  8b 40 04        mov eax, [eax+0x4]
      f07745d0  85 c0           test eax, eax
      f07745d2  74 03           jz 0xf07745d7
      f07745d4  8b 78 04        mov edi, [eax+0x4]
      f07745d7  89 7e 04        mov [esi+0x4], edi

      e2bfdbf0  01 00 00 00 77 00 00 00  84 f8 04 08 a2 b0 c0 f7  ....w...........
      e2bfdc00  68 dc bf e2 00 1b 40 d6  28 dc bf e2 12 64 77 f0  h.....@.(....dw.

      Found via instruction pointer in context

sourcepawn.jit.x86.zip

Wend4r avatar Aug 06 '22 08:08 Wend4r

There is no thread safety model for SourcePawn data structures, and it'd be very difficult to define and implement one efficiently. And I'm opposed to a global interpreter lock, which defeats the point of threading. So really the only path forward is an isolation model, where separate threads do not share any resources except those that can be sent over pipes. Think something like Web Workers.

Even if we build that out, SourceMod is completely thread unsafe. Every single native would have to be audited, and the set of natives available would have to be restricted off the main thread. Otherwise it'd be a free-for-all and we'd get bug reports every day about how random thing X crashes 10% of the time.

tl;dr: a "proper" solution is not coming anytime soon. Instead, I think spot-fixes are needed for safety here. Anywhere a SourceHook callback could be called off the main thread, we should check and bypass interactions with main-thread plugins (and use mutexes if main thread data structures are at risk).

Your solution of using an extension or MM:S plugin is the right way, IMO.

dvander avatar Aug 06 '22 22:08 dvander

Another case with asynchronous SDKHook_SetTransmit

Stack trace
Function
0 libc-2.31.so + 0x14dd90
1 sourcepawn.jit.x86.so!sp::ScriptedInvoker::Invoke(int*) + 0x1a7
2 sourcepawn.jit.x86.so!sp::ScriptedInvoker::Execute(int*) + 0x5f
3 sdkhooks.ext.2.csgo.so!SDKHooks::Call(CBaseEntity*, SDKHookType, CBaseEntity*) + 0x104
4 sdkhooks.ext.2.csgo.so!SDKHooks::Hook_SetTransmit(CCheckTransmitInfo*, bool) + 0x55
5 sdkhooks.ext.2.csgo.so!__SourceHook_MFHCls_SetTransmit::CMyDelegateImpl::Call(CCheckTransmitInfo*, bool) + 0x2d
6 sdkhooks.ext.2.csgo.so!__SourceHook_MFHCls_SetTransmit::Func(CCheckTransmitInfo*, bool) + 0x115
7 server.so + 0x6d66c3
8 engine.so + 0x1cc4a3
9 engine.so + 0x1d4aba
10 engine.so + 0x1d4d19
11 engine.so + 0x1d56bc
12 engine.so + 0x1d4db5
13 engine.so + 0x1d4df1
14 engine.so + 0x1d4e77
15 libvstdlib.so + 0x1a679
16 libtier0.so!CThread::ThreadProc(void*) + 0x97
17 libpthread-2.31.so!start_thread [pthread_create.c:477 + 0x3]
18 libc-2.31.so!__clone + 0x66
19 0xc68ffb40

Wend4r avatar Sep 12 '22 07:09 Wend4r

Reporting stacks is not helping anything. We've always known SP doesn't support running concurrently.

Fyren avatar Sep 12 '22 12:09 Fyren

This is an sdkhooks bug not SP.

dvander avatar Sep 12 '22 12:09 dvander