box64 icon indicating copy to clipboard operation
box64 copied to clipboard

RimWorld 1.2/1.3 - Random crashes

Open KweezyCode opened this issue 3 years ago • 68 comments

Mono path[0] = '/home/kweezy/Desktop/RimWorld/RimWorldLinux_Data/Managed' Mono config path = '/home/kweezy/Desktop/RimWorld/RimWorldLinux_Data/MonoBleedingEdge/etc' 0xa04860a0: Stack is corrupted, aborting ==== CPU Registers ==== RAX=207119f200bbe6c3 RCX=0000000000000000 RDX=0000ffffb393d440 RBX=0000ffffb393d3b0 RSP=0000ffffb393d398 FLAGS=--S--PC RBP=0000ffffb393d448 RSI=0000ffffb393d448 RDI=0000ffffb393d3b0 R8=0000000000000001 R9=0000000000000010 R10=0000000000000000 R11=0000000000000000 R12=0000ffffb393d440 R13=00000000a2cde790 R14=0000ffffb393d498 R15=00000000a2cde780 RIP=00000000a04a6913 ======== Stack ======== Stack is from FFFFB393D448 to FFFFB393D398 Old IP: A04860A0 LIBGL: Shuting down 59973|0xa04a9000: Unimplemented Opcode (00) F0 1F EA A0 00 00 00 00 90 72 49 A0 00 00 00

KweezyCode avatar Sep 04 '21 23:09 KweezyCode

Same for me, on a Pi4 8GB with Ubuntu MATE.

dinchamion avatar Sep 29 '21 16:09 dinchamion

Same for me, on a Pi4 8GB with Ubuntu MATE.

as i see, developer ignores just this issue, or i must be blind

KweezyCode avatar Oct 04 '21 13:10 KweezyCode

The opocde liste is an invalid one. The issue os not the opecode, but the "Stack is corrupted" message that happened before.

I have fixed some the stack corruption error, but not all it seems.

ptitSeb avatar Oct 04 '21 13:10 ptitSeb

problem still exists, OS: Ubuntu 21.04 Desktop dump.txt dump1.txt

KweezyCode avatar Oct 04 '21 14:10 KweezyCode

Ok Relevent part on dump1 is

29907|0xffff969ffd5a: Calling my_pthread_mutex_init (/usr/local/bin/box64)(0xFFFF96D4E960, 0x0, 0x1, ...) => return 0x0
29907|0xffff96a16db8: Calling __libc_calloc (/lib/aarch64-linux-gnu/libc.so.6)(0x1, 0x10, 0x1, ...) => return 0x8C61CCF0
29907|0xffff96a0eba8: Calling sem_init (/lib/aarch64-linux-gnu/libpthread.so.0)(0x8C61AC30, 0x0, 0x0, ...) => return 0x0
29907|0xffff96a0e8f9: Calling pthread_setspecific (/lib/aarch64-linux-gnu/libpthread.so.0)(0x19, 0x8C61ABD0, 0x0, ...) => return 0x0
29907|0xffff96a11c41: Calling my_pthread_attr_init (/usr/local/bin/box64)(0xFFFFB03AB3E0, 0xFFFFB03AB470, 0x0, ...) => return 0x0
29907|0xffff96a11c7b: Calling pthread_self (/lib/aarch64-linux-gnu/libc.so.6)(0xFFFFB03AB3E0, 0xFFFFB03AB470, 0x0, ...) => return 0xFFFFB08BB870
29907|0xffff96a11c86: Calling pthread_getattr_np (/lib/aarch64-linux-gnu/libc.so.6)(0xFFFFB08BB870, 0xFFFFB03AB3E0, 0x0, ...) => return 0x0
29907|0xffff96a11cc9: Calling my_pthread_attr_getstack (/usr/local/bin/box64)(0xFFFFB03AB3E0, 0xFFFFB03AB478, 0xFFFFB03AB470, ...) => return 0x0
29907|0xffff96a11d06: Calling my_pthread_attr_destroy (/usr/local/bin/box64)(0xFFFFB03AB3E0, 0xFFFFB03AB478, 0xFFFFB03AB470, ...) => return 0x0
29907|0xffff96a11d50: Calling my___stack_chk_fail (/usr/local/bin/box64)(0xFFFFB03AB3E0, 0xFFFFB03AB478, 0xFFFFB03AB470, ...) =>0x8a0350a0: Stack is corrupted, aborting
==== CPU Registers ====
RAX=10d5e6b23c171300 RCX=0000ffff96a0e8dd RDX=0000ffffb03ab470 RBX=0000ffffb03ab3e0 RSP=0000ffffb03ab3c8 FLAGS=--S--PC
RBP=0000ffffb03ab478 RSI=0000ffffb03ab478 RDI=0000ffffb03ab3e0  R8=0000000000000001  R9=0000000000000010 
R10=0000000000000000 R11=0000000000000000 R12=0000ffffb03ab470 R13=000000008c60dc70 R14=0000ffffb03ab4c8 
R15=000000008c60dc60 RIP=000000008a055913 
======== Stack ========
Stack is from FFFFB03AB478 to FFFFB03AB3C8
Old IP: 8A0350A0
 return 0x10D5E6B23C171300
 return 0x0
End of X86 run (0x8a036130), RIP=0xffff96a11d50, Stack=0xffffb03ab3d0, unimp=0, emu->fork=0, emu->uc_link=(nil), emu->quit=1

Seems to be related to the my_pthread_attr_getstack(...) call. I'll check later.

ptitSeb avatar Oct 04 '21 14:10 ptitSeb

This is probably fixed by #135. Could you try building a fresh box64 and testing again?

mogery avatar Oct 17 '21 21:10 mogery

now i see black window that instantly closes dump.txt dumpfromgame.txt (about 38mb): https://ufile.io/c2denjit

KweezyCode avatar Oct 18 '21 18:10 KweezyCode

You need to run the game with the MESA_GL_VERSION_OVERRIDE=3.2 environment variable and the -force-glcore32 command line argument.

mogery avatar Oct 18 '21 18:10 mogery

wow, i see game loading screen but then it crashes with Aborted (core dumped) :( do you need log?

KweezyCode avatar Oct 18 '21 18:10 KweezyCode

do you need log?

Yup. STDOUT, Player.log, and a reading from cat /proc/$(pgrep RimWorld)/maps while the loading screen is present.

mogery avatar Oct 18 '21 19:10 mogery

Same as @KweezyPlay, it starts up, loads for a while, and then crashes. dinchamion@tranquility-pi:~/apps/rimworld/data/noarch/game$ MESA_GL_VERSION_OVERRIDE=3.2 ./start_RimWorld_openglfix.sh Dynarec for ARM64, with extension: ASIMD CRC32 PageSize:4096 Box64 with Dynarec v0.1.5 36bfcd2 built on Oct 18 2021 19:06:15 Using default BOX64_LD_LIBRARY_PATH: ./:lib/:lib64/:x86_64/:bin64/:libs64/ Using default BOX64_PATH: ./:bin/ Counted 64 Env var Looking for ./RimWorldLinux argv[1]="-logfile" argv[2]="/tmp/rimworld_log_1000" Using emulated UnityPlayer.so Using native(wrapped) libm.so.6 Using emulated /lib/x86_64-linux-gnu/libgcc_s.so.1 Using native(wrapped) libpthread.so.0 Using native(wrapped) libc.so.6 Using native(wrapped) ld-linux-x86-64.so.2 Using native(wrapped) librt.so.1 Using native(wrapped) libdl.so.2 Set current directory to /home/dinchamion/apps/rimworld/data/noarch/game Found path: /home/dinchamion/apps/rimworld/data/noarch/game/RimWorldLinux Using native(wrapped) libX11.so.6 Using native(wrapped) libXext.so.6 Using native(wrapped) libxcb.so.1 Using native(wrapped) libXau.so.6 Using native(wrapped) libXdmcp.so.6 Using native(wrapped) libXcursor.so.1 Using native(wrapped) libXinerama.so.1 Using native(wrapped) libXi.so.6 Using native(wrapped) libXrandr.so.2 Using native(wrapped) libXrender.so.1 Using native(wrapped) libXss.so.1 Using native(wrapped) libXxf86vm.so.1 Using native(wrapped) libGL.so.1 Using native(wrapped) libudev.so.1 ./start_RimWorld_openglfix.sh: line 39: 13181 Aborted (core dumped) ./$GAMEFILE $LOG $FORCEGL $@

and

dinchamion@tranquility-pi:~/apps/box64/build$ cat /proc/$(pgrep RimWorld)/maps cat: /proc/13168: Is a directory cat: 13181/maps: No such file or directory

Log: rimworld_log_1000.txt

Don't know how to do STDOUT. :-/

dinchamion avatar Oct 18 '21 19:10 dinchamion

for no reason, when i used BOX64_LOG=2 it loaded. let me test it for stability.

KweezyCode avatar Oct 18 '21 19:10 KweezyCode

P_20211018_222347_1

KweezyCode avatar Oct 18 '21 19:10 KweezyCode

for no reason, when i used BOX64_LOG=2 it loaded. let me test it for stability.

Please try it a couple times again. This might be memory mapping related.

mogery avatar Oct 18 '21 19:10 mogery

Have been trying it a few times. Got it running twice, but otherwise crash:

RimWorld 1.3.3117 rev664 Error loading needed lib /home/dinchamion/apps/rimworld/data/noarch/game/RimWorldLinux_Data/Managed/System.Xml.Linq.dll.so Warning: Cannot dlopen("/home/dinchamion/apps/rimworld/data/noarch/game/RimWorldLinux_Data/Managed/System.Xml.Linq.dll.so"/0x8605c320, 101) Error loading needed lib /home/dinchamion/apps/rimworld/data/noarch/game/RimWorldLinux_Data/Managed/mono/aot-cache/amd64/System.Xml.Linq.dll.so Warning: Cannot dlopen("/home/dinchamion/apps/rimworld/data/noarch/game/RimWorldLinux_Data/Managed/mono/aot-cache/amd64/System.Xml.Linq.dll.so"/0x86c2e9a0, 101) double free or corruption (fasttop) 18495|0x64952d8a: Unimplemented Opcode (DF) FE 97 60 02 40 F9 5C DD FE 97 E3 FF FF 17 1F

rimworld-crash.txt

Using BOX64_LOG=2 outright makes it hang without even getting to the loading screen. Both with and without the Royalty files placed in the Data directory. (I don't have the latest expansion yet.)

Gonna try to test it a bit more, and if I luck into making it run, will try to get some logs.

Screenshot at 2021-10-18 20-02-45 Screenshot at 2021-10-18 20-13-57

dinchamion avatar Oct 18 '21 20:10 dinchamion

Using BOX64_LOG=2 outright makes it hang without even getting to the loading screen.

I'm pretty sure it doesn't hang, preload just gets slowed down by all the writing, so you're gonna have to wait a couple minutes. It'd be awesome if you could get me a Player.log with LOG=2.

mogery avatar Oct 18 '21 20:10 mogery

So it's probably a timing issue if detailled logs makes it run.

ptitSeb avatar Oct 18 '21 21:10 ptitSeb

I'm pretty sure it doesn't hang, preload just gets slowed down by all the writing, so you're gonna have to wait a couple minutes. It'd be awesome if you could get me a Player.log with LOG=2.

You're right there, I've let it run for about an hour or so, and it did get to the loading screen. I had to kill it, but could perhaps let it run as long as it takes... it did however seem to be hanging there at that point. (There's an ellipsis loading animation, which was frozen.)

Thing is, Player.log is 1.7GB. :-D

dinchamion@tranquility-pi:~/apps/box64/build$ ls -l /home/dinchamion/.config/unity3d/Ludeon\ Studios/RimWorld\ by\ Ludeon\ Studios/ total 1631204 drwxrwxr-x 2 dinchamion dinchamion 4096 Oct 18 20:13 Config -rw-rw-r-- 1 dinchamion dinchamion 1670331479 Oct 18 21:52 Player.log -rw-rw-r-- 1 dinchamion dinchamion 841 Oct 18 21:41 prefs drwxrwxr-x 2 dinchamion dinchamion 4096 Oct 18 20:14 Saves drwxrwxr-x 2 dinchamion dinchamion 4096 Oct 18 20:02 Scenarios

Not sure how large it'd grow if I just let it run for hours. I have space, but not infinite. :-) And also above a certain size I would have an issue uploading it to someplace.

I grabbed the tail after killing it:

dinchamion@tranquility-pi:~/apps/box64/build$ tail /home/dinchamion/.config/unity3d/Ludeon\ Studios/RimWorld\ by\ Ludeon\ Studios/Player.log 20228|0xfffeecaf6d3b: Calling __sched_yield (/lib/aarch64-linux-gnu/libc.so.6)(0xFFFE27DE7080, 0x0, 0xFFFEECE1FD68, ...) => return 0x0 20228|0xfffeecaf6d3b: Calling __sched_yield (/lib/aarch64-linux-gnu/libc.so.6)(0xFFFE27DE7080, 0x0, 0xFFFEECE1FD68, ...) => return 0x0 20228|0xfffeecaf6d3b: Calling __sched_yield (/lib/aarch64-linux-gnu/libc.so.6)(0xFFFE27DE7080, 0x0, 0xFFFEECE1FD68, ...) => return 0x0 20228|0xfffeecaf6d3b: Calling __sched_yield (/lib/aarch64-linux-gnu/libc.so.6)(0xFFFE27DE7080, 0x0, 0xFFFEECE1FD68, ...) => return 0x0 20228|0xfffeecaf6d3b: Calling __sched_yield (/lib/aarch64-linux-gnu/libc.so.6)(0xFFFE27DE7080, 0x0, 0xFFFEECE1FD68, ...) => return 0x0 20228|0xfffeecaf6d3b: Calling __sched_yield (/lib/aarch64-linux-gnu/libc.so.6)(0xFFFE27DE7080, 0x0, 0xFFFEECE1FD68, ...) => return 0x0 20228|0xfffeecaf6d67: Calling nanosleep (/lib/aarch64-linux-gnu/libc.so.6)(0xFFFE25C5A4C0, 0x0, 0xFFFEECE1FD68, ...) =>20226|0xfffeecaf1415: Calling pthread_self (/lib/aarch64-linux-gnu/libc.so.6)(0x0, 0xA, 0xA2, ...) => return 0xFFFE263820C0 20226|0xfffeecaf148b: Calling my_pthread_kill (/usr/local/bin/box64)(0xFFFFA7DB0870, 0x1E, 0xA2, ...) => return 0x0 Sigactionhanlder for signal #30 called (jump to 0xfffeecaf5ac2/???(0xfffeecaf5ac2)) 20004|signal function handler 0xfffeecaf5ac2 called, RSP=0xffffa6fa78a8

Maybe you see something from this. Let me know if you want me to run it longer, or upload this 1.7gig log to Dropbox, or something else.

I'd love to figure this out. I miss playing Rimworld.

dinchamion avatar Oct 18 '21 22:10 dinchamion

about an hour or so

Hmm. Ok. That's an issue for trying over and over again. I'll think about how to approach this.

mogery avatar Oct 19 '21 04:10 mogery

Ok, so I only care about logs if the game crashes, but the game only crashed after the long waiting for preload.

I'll try to get some RimWorld binaries and test on my own time.

@dinchamion What version Ubuntu MATE are you using?

mogery avatar Oct 19 '21 06:10 mogery

i got it running only once, can't run game anymore for no reason. i am using Ubuntu 21 (the latest). I can send you binaries

KweezyCode avatar Oct 19 '21 06:10 KweezyCode

i got it running only once, can't run game anymore for no reason. i am using Ubuntu 21 (the latest). I can send you binaries

Please send Player.log with LOG=3 and a memory map. I'll try to get binaries a legit way. We'll see.

mogery avatar Oct 19 '21 08:10 mogery

Current testing status:

  • starting with no log setting or 1: game launches and crashes on the loading screen (couldn't yet get it running again)
  • with log level 2 or 3 it creates a 1.6+gb log file for both Player.log and /tmp/rimworld_log_1000 and just hangs there pretty indefinitely. CPU and memory load is negligible, so my guess is that it freezes and doesn't do anything. (But I'm not sure. I'll let it run for a couple of hours later today.)

I've uploaded a player.log from the crash (with no loglevel) and the "idle" state I killed after a while (with loglevel 3): https://www.dropbox.com/sh/bhkan3u8t115hhw/AADlce1rifWNYzGilarGtvoca?dl=0

I don't know how to do a memory map, you'll need to be slow and patient with me. :-P

dinchamion avatar Oct 19 '21 09:10 dinchamion

I've uploaded a player.log from the crash (with no loglevel) and the "idle" state I killed after a while (with loglevel 3): https://www.dropbox.com/sh/bhkan3u8t115hhw/AADlce1rifWNYzGilarGtvoca?dl=0

Unfortunately I can't really deduce anything from that, only crash logs are useful to me. I'll need to make the game crash with a debugger running on a machine that's local to me.

I'll see what I can do when I get home in a couple hours.

Are you also running Ubuntu 21.04?

mogery avatar Oct 19 '21 11:10 mogery

Are you also running Ubuntu 21.04?

Yes, 21.04 on a Pi4 8GB.

I've left the game running (or whatever it's doing) in the morning, so we'll see if I have anything useful when I get home. (I'm hoping it'd crash instead of just hanging there, to get a crash log.)

dinchamion avatar Oct 19 '21 12:10 dinchamion

Could anyone zip up their game folder and send it to me to [email protected]? It would let me debug on my own terms and would be amazing

mogery avatar Oct 19 '21 15:10 mogery

Alright, cool, got it, working on it. Thanks.

mogery avatar Oct 19 '21 16:10 mogery

Ok, something very weird is going on. Sometimes FreeDynablock is doublefreeing (doublefree or corruption (fasttop)). Sometimes Run gets R_RIP = 0, crashing the whole process. Sometimes mono_os_mutex_lock gets a bad mutex and gets EINVAL from pthread_mutex_lock. Sometimes the game's XML parsing just fails with abruptly ended strings or NullReferenceExceptions. Sometimes Mono misdetects a class as having weak fields, and then tries to register it in the GC, which fails, because the GC has no support for them. Sometimes FillBlock64's call to malloc fails with corrupted double-linked list. Sometimes the game starts completely fine. Thoughts, @ptitSeb? Smells like some sort of memory corruption or memory mapping issues to me.

mogery avatar Oct 19 '21 17:10 mogery

Yes, it's a memory corruption! It may be because a wrong function signature? Or a structure thta have a different size in x86_64 vs ARM64 but is not adjusted in box64 yet.

ptitSeb avatar Oct 19 '21 17:10 ptitSeb

Yes, it's a memory corruption! It may be because a wrong function signature?

Wrong function signature as in one that's wrapped?

mogery avatar Oct 19 '21 18:10 mogery