Killing Floor Dedicated Server Crashes with SIGSEGV

Open tiagojofran opened this issue 1 year ago • 2 comments

Hi, @ptitSeb. I'm running a Killing Floor dedicated server on my OCI ARM instance, and it works really well most of the time. Unfortunately, I'm seeing problems when players vote for certain maps among the available options. When one of these maps is chosen and the game tries to load it, the server either crashes with:

Apr 16 06:37:21 oci-server ucc-bin[2707396]: Bringing Level KF-Hellride.myLevel up for play (30) appSeconds: 242.091332...
Apr 16 06:37:21 oci-server ucc-bin[2707396]: (Karma): StaticMesh (signs22) with empty Karma KAggregateGeometry.
Apr 16 06:37:21 oci-server ucc-bin[2707396]: (Karma): StaticMesh (signs22) with empty Karma KAggregateGeometry.
Apr 16 06:37:21 oci-server ucc-bin[2707396]: Signal: SIGSEGV [segmentation fault]
Apr 16 06:37:21 oci-server ucc-bin[2707396]: Aborting.

Or it just hangs with:

Apr 16 06:49:17 oci-server ucc-bin[2708564]: Bringing Level KF-Hellride.myLevel up for play (30) appSeconds: 705.360812...
Apr 16 06:49:17 oci-server ucc-bin[2708564]: (Karma): StaticMesh (signs22) with empty Karma KAggregateGeometry.
Apr 16 06:49:17 oci-server ucc-bin[2708564]: (Karma): StaticMesh (signs22) with empty Karma KAggregateGeometry.
Apr 16 06:49:17 oci-server ucc-bin[2708564]: free(): corrupted unsorted chunks

After that, the server has to be restarted manually.

Box86 logs only show these entries:

argv[1]="server"
argv[2]="KF-Farm.rom?game=KFmod.KFGameType?VACSecured=true?MaxPlayers=6"
argv[3]="-nohomedir"
Rename process to "ucc-bin-real"
Using native(wrapped) libdl.so.2
Using native(wrapped) libc.so.6
Using native(wrapped) ld-linux.so.2
Using native(wrapped) libpthread.so.0
Using native(wrapped) librt.so.1
Using native(wrapped) libbsd.so.0
Using emulated libsteam_api.so
Using emulated libstdc++.so.6
Using native(wrapped) libm.so.6
Using emulated /lib/i386-linux-gnu/libgcc_s.so.1
Warning: call to partially implemented dl_iterate_phdr(0x60415d50, 0xf523e6e8)
Warning: call to partially implemented dl_iterate_phdr(0x60415d50, 0xf523e7e8)
Warning: call to partially implemented dl_iterate_phdr(0x60415d50, 0xf523e7e8)
Warning: call to partially implemented dl_iterate_phdr(0x60415d50, 0xf523e7e8)
Warning: call to partially implemented dl_iterate_phdr(0x60415d50, 0xf523e6f8)
...
Using emulated steamclient.so
Warning: Weak Symbol _ZGTtnaj not found, cannot apply R_386_JMP_SLOT 0x6de624b8 (0x125fc6)
Error loading needed lib crashhandler.so
Warning: Cannot dlopen("crashhandler.so"/0x6deeb3b8, 2)
Error loading needed lib libsteam.so
Warning: Cannot dlopen("libsteam.so"/0x6000cf4f, 2)
Error loading needed lib libsteam.so
Warning: Cannot dlopen("libsteam.so"/0x6000cf4f, 2)
Error loading needed lib libSDL3.so.0
Warning: Cannot dlopen("libSDL3.so.0"/0x6d494714, 2)
Warning: call to partially implemented dl_iterate_phdr(0x60415d50, 0xf5235d58)
Warning: call to partially implemented dl_iterate_phdr(0x60415d50, 0xf5235e58)
...

I suspected that the map files were simply corrupted, but the problem persists after redownloading and validating them with steamcmd.
Could this be a box86 issue? Are there any environment variables I should apply and test for this server? Please let me know if you need additional logs. Also, I'd like to thank you and your contributors for these amazing projects. If it weren't for box86/64, our OCI instances wouldn't be so amazingly versatile.

tiagojofran avatar Apr 17 '24 00:04 tiagojofran

Try to get some box86 logs. Use these environment variables to launch the server: BOX86_LOG=1 BOX86_SHOWSEGV=1 BOX86_SHOWBT=1 BOX86_TRACE_FILE=logs.txt, which will generate logs in "logs.txt" and print the segfault.

You can also try running the server with BOX86_DYNAREC_STRONGMEM=1 in case this is a multi-threading issue (it could be, since the crash seems pseudo-random).
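
For anyone who wants to script this, here is a minimal Python sketch of a launcher that sets the variables above before starting the server. The `./ucc-bin` path and the helper names `build_env` and `launch` are assumptions for illustration; the server arguments are the ones shown in the box86 log of the original report, so adjust them to your own setup.

```python
import os
import subprocess

# Variables quoted from the comment above; values are passed through unchanged.
DIAGNOSTIC_ENV = {
    "BOX86_LOG": "1",
    "BOX86_SHOWSEGV": "1",
    "BOX86_SHOWBT": "1",
    "BOX86_TRACE_FILE": "logs.txt",
}

# Assumption: the server is started as ./ucc-bin from its install directory,
# with the arguments that appear in the box86 log of the original report.
SERVER_CMD = [
    "./ucc-bin",
    "server",
    "KF-Farm.rom?game=KFmod.KFGameType?VACSecured=true?MaxPlayers=6",
    "-nohomedir",
]


def build_env(strongmem=False, base=None):
    """Return a copy of the environment with the box86 diagnostics added."""
    env = dict(os.environ if base is None else base)
    env.update(DIAGNOSTIC_ENV)
    if strongmem:
        # Optional: only set when testing the multi-threading hypothesis.
        env["BOX86_DYNAREC_STRONGMEM"] = "1"
    return env


def launch(strongmem=False):
    """Start the server with diagnostics enabled and return its exit code."""
    return subprocess.run(SERVER_CMD, env=build_env(strongmem)).returncode
```

Calling `launch(strongmem=True)` from the server directory would run one session with both the logging variables and BOX86_DYNAREC_STRONGMEM=1 set; calling `launch()` keeps only the logging variables, which makes it easy to compare the two.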

ptitSeb avatar Apr 17 '24 07:04 ptitSeb

I applied the variables as requested, including BOX86_DYNAREC_STRONGMEM=1, which didn't seem to help. The issue really does seem to be random: I had to force the crash a few times to reproduce the segfault, because the hang without a segfault occurs more often. The logs I obtained were pretty much identical to the earlier ones; the only difference was these lines, which I believe describe the segfault:

3197510|SIGSEGV @0x6294900c (???(./ucc-bin-real/0x6294900c)) (x86pc=0x4004007b/???:"???", esp=0xea7fc04c, stack=0xea000000:0xea800000 own=(nil) fp=0xea7fc078), for accessing 0x1b (code=1/prot=0), db=(nil)((ni>ESP-0x10:0x36f00000 ESP-0x0c:0xea7fc06c ESP-0x08:0x65b538a4 ESP-0x04:0xea7fc078
ESP+0x00:0x08057315 ESP+0x04:0x65b53314 ESP+0x08:0x65b53028 ESP+0x0c:0xea7fc088
Native bactrace:

The logs just end there.

Also, the server logs showed these after different crashes:

Apr 17 08:26:45 oci-server ucc-bin[3197510]: malloc(): invalid size (unsorted)

Apr 17 08:48:08 oci-server ucc-bin[3203276]: corrupted size vs. prev_size

Are these of any help?
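
Since the failure mode varies between runs, it may help to tally which abort message each server process logged. The sketch below is a hypothetical helper (`crash_signatures` is not part of box86 or the game) that scans syslog-style lines like the ones quoted in this thread. The glibc messages in the list are emitted when the allocator detects inconsistent heap metadata, which would be consistent with memory corruption rather than a damaged map file, though that is only an inference from the logs.

```python
import re
from collections import defaultdict

# Messages copied from the server logs in this thread. The glibc ones are
# printed when the allocator detects inconsistent heap metadata.
SIGNATURES = (
    "Signal: SIGSEGV [segmentation fault]",
    "free(): corrupted unsorted chunks",
    "malloc(): invalid size (unsorted)",
    "corrupted size vs. prev_size",
)

# Matches the "ucc-bin[PID]: message" part of a syslog-style line.
LINE_RE = re.compile(r"ucc-bin\[(\d+)\]: (.*)$")


def crash_signatures(lines):
    """Map each PID to the crash messages it logged, in order of appearance."""
    found = defaultdict(list)
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue
        message = match.group(2).strip()
        if message in SIGNATURES:
            found[int(match.group(1))].append(message)
    return dict(found)
```

Feeding it the lines of a saved log file (for example `crash_signatures(open("server.log"))`, where the file name is an assumption) gives one entry per crashed process, so the ratio of segfaults to allocator aborts can be compared across box86 versions.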

tiagojofran avatar Apr 17 '24 09:04 tiagojofran

Somehow, the issue appears to have been solved with the latest versions of box86. I'm closing this issue now as there is nothing else to add to it.

tiagojofran avatar Jun 03 '24 08:06 tiagojofran

Glad to see it's fixed!

ptitSeb avatar Jun 03 '24 09:06 ptitSeb