Valheim Dedicated Server Crashes Post Unity 2022 Update
Hello there,
I had been running a Valheim dedicated server on an Ampere ARM-based server, using the install script below: https://gist.github.com/husjon/c5225997eb9798d38db9f2fca98891ef This had been working for quite some time. However, since the November 7th update of the server and clients to Unity 2022, the server disconnects clients and then crashes about 20 seconds after a client connects. I can confirm I am pulling and compiling a fairly recent version of box64:
box64 --version
Dynarec for ARM64, with extension: ASIMD AES CRC32 PMULL ATOMICS PageSize:4096 Running on Neoverse-N1 with 4 Cores
Params database has 30 entries
Box64 with Dynarec v0.2.5 2a4fe803 built on Dec 24 2023 07:41:33
Finally, I would like to submit logs, but the logfiles generated are over 350 MB in size. How would I submit them here?
I am generating these logs via BOX64_LOG=2 BOX64_TRACE_FILE=valheim_arm_crash.txt /home/ubuntu/valheim_server/valheim_server.x86_64 -nographics -batchmode -port 2456 -public 1 -name serverName -password sometemporarydummypassword -savedir /home/ubuntu/valheim_data
Thank you for your time and for this awesome project.
Valheim server still works using Box64 v0.2.4 on my end (RPI4 4GB, Ubuntu 64bit), but fails to start using v0.2.6.
However, starting with v0.2.4 leads to a server crash after an arbitrary amount of time (leaving behind mono_crash memory dumps)
@sea212 did you try with current version also? Can you bisect the issue if it's still broken?
@ptitSeb I'll provide more information shortly.
When I tried to bisect, I realized that I can't reliably reproduce the server hanging on launch. Sometimes it does, but most of the time it seems to work.
If it's a random issue, you can try to start the server with BOX64_DYNAREC_STRONGMEM=1 (or play with ~/.box64rc if you know it); that might be a needed parameter.
To clarify, should I stay on a more recent build of box64, or revert to 0.2.4 before trying this flag?
Also, I would like to provide logs, but they are still huge with
BOX64_LOG=2 BOX64_TRACE_FILE=valheim_arm_crash.txt
Is there any way to make them smaller? Thanks again for the help
Stay on current, it would be easier for me. Also, if it just works with BOX64_DYNAREC_STRONGMEM=1, that would be the solution! The parameter could go in a config file so it's automatically applied.
To make the logs smaller, you can use BOX64_ROLLING_LOG=1 instead of BOX64_LOG=2; that shows just the last 10 function calls before the crash. But if the issue is a multi-thread issue, those logs will hardly be useful anyway, as each crash would show something different...
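If a full trace has already been captured, one way to get it down to an attachable size is to keep only the tail, which is where the crash shows up, and compress it. This is a minimal sketch: the helper name `trim_log` and the default of 5000 lines are illustrative assumptions, not something box64 provides.

```shell
#!/bin/bash
# trim_log FILE [LINES]
# Keep only the last LINES lines of FILE (default 5000) and write them,
# gzip-compressed, to FILE.tail.gz. The original file is left untouched.
# (Hypothetical helper, not part of box64.)
trim_log() {
    local file="$1"
    local lines="${2:-5000}"
    tail -n "$lines" "$file" | gzip -c > "${file}.tail.gz"
}

# Example, using the trace file name from this thread:
#   trim_log valheim_arm_crash.txt 20000
```

For a multi-thread problem the tail may differ from crash to crash, so attaching the tails of two or three separate runs is probably more telling than a single one.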
@ptitSeb Ok, I recompiled to make sure I am on the latest version
box64 --version
Dynarec for ARM64, with extension: ASIMD AES CRC32 PMULL SHA1 SHA2 PageSize:4096 Running on Neoverse-N1 with 4 Cores
Params database has 48 entries
Box64 with Dynarec v0.2.7 41bfd757 built on Jan 7 2024 20:31:39
I ran this below and was able to generate a more interesting log
BOX64_DYNARC_STRONGMEM=1 BOX64_ROLLING_LOG=1 BOX64_TRACE_FILE=valheim_arm_crash.txt /home/ubuntu/valheim_server/valheim_server.x86_64 \
-nographics -batchmode -port 2456 -public 1 -name serverName -password sometemporarydummypassword \
-savedir /home/ubuntu/valheim_data
In particular
Warning: Global Symbol _ZTH15gDeferredAction not found, cannot apply R_X86_64_GLOB_DAT @0x7fff01d3d660 ((nil)) in /home/ubuntu/valheim_server/UnityPlayer.so
Error loading needed lib libpulse-mainloop-glib.so.0
Error loading one of needed lib
Error initializing needed lib /home/ubuntu/valheim_server/valheim_server_Data/Plugins/libparty.so
Which is odd, previously libpulse-mainloop was only needed for crossplay, so maybe the default behavior is now crossplay on?
I'm on an RPI 5 running Ubuntu Server 23.10, getting the same issue as those above - silent crashing after an arbitrary period of time.
I'm running
Dynarec will try to emulate a strong memory model with limited performance loss
Dynarec for ARM64, with extension: ASIMD AES CRC32 PMULL SHA1 SHA2 PageSize:4096 Running on Cortex-A76 with 4 Cores
Params database has 48 entries
Box64 with Dynarec v0.2.7 328d5c62 built on Jan 11 2024 19:17:20
To note, I'm seeing occasional Error loading one of needed lib errors, such as:
Error loading needed lib libparty.so
Warning: Cannot dlopen("libparty.so"/0x71ab21f0, 101)
Using emulated ./linux64/steamclient.so
Redirecting overridden malloc from symtab function for ./linux64/steamclient.so
Warning: Weak Symbol _ITM_RU1 not found, cannot apply R_X86_64_JUMP_SLOT @0x7fff0c233fb8 (0x32a296)
Warning: Weak Symbol _ZGTtnam not found, cannot apply R_X86_64_JUMP_SLOT @0x7fff0c233fc0 (0x32a2a6)
Warning: Weak Symbol _ITM_memcpyRtWn not found, cannot apply R_X86_64_JUMP_SLOT @0x7fff0c233fc8 (0x32a2b6)
Warning: Weak Symbol _ITM_RU8 not found, cannot apply R_X86_64_JUMP_SLOT @0x7fff0c233fd0 (0x32a2c6)
[S_API] SteamAPI_Init(): Loaded local 'steamclient.so' OK.
Using native(wrapped) crashhandler.so
CAppInfoCacheReadFromDiskThread took 1 milliseconds to initialize
Error loading needed lib steamservice.so
Warning: Cannot dlopen("steamservice.so"/0xffff786ba8b0, 2)
dlmopen steamservice.so failed: Cannot dlopen("steamservice.so"/0xffff786ba8b0, 2)
Setting breakpad minidump AppID = 892970
SteamInternal_SetMinidumpSteamID: Caching Steam ID: 76561197960265728 [API loaded no]
Error loading needed lib libsteam.so
Warning: Cannot dlopen("libsteam.so"/0x7fff07038ad9, 2)
[S_API FAIL] Tried to access Steam interface SteamNetworkingUtils004 before SteamAPI_Init succeeded.
Let me know if there's anything else I can provide.
In addition to the above, I took a log of my last run of the game using the following shell script:
#!/bin/bash
export templdpath=$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=./linux64:$LD_LIBRARY_PATH
export SteamAppId=892970
export BOX64_SHOWBT=1
export BOX64_LOG=1
export BOX64_TRACE_FILE=log.txt
export BOX64_DYNAREC_STRONGMEM=1
echo "Starting server PRESS CTRL-C to exit"
# Tip: Make a local copy of this script to avoid it being overwritten by steam.
# NOTE: Minimum password length is 5 characters & Password cant be in the server name.
# NOTE: You need to make sure the ports 2456-2458 is being forwarded to your server through your local router & firewall.
./valheim_server.x86_64 \
-name "something" \
-port 2456 \
-world "something" \
-password "secret" \
-nographics \
-batchmode \
-public 0
export LD_LIBRARY_PATH=$templdpath
Pardon my ignorance, but it's curious to see so many errors about libpulse-mainloop-glib.so.0 - If I run sudo apt install libpulse-mainloop-glib0 I'm apparently on the newest version. party.so straight up doesn't exist, but libparty.so does in /valheim_server_Data/Plugins. steamservice.so doesn't appear to exist on my system.
I am having a similar issue. I do not have BOX64_DYNAREC_STRONGMEM set to 1. Details from box64 --version:
Dynarec for ARM64, with extension: ASIMD CRC32 PageSize:4096 Running on Cortex-A72 with 4 Cores
Params database has 49 entries
Box64 with Dynarec v0.2.7 57ca9dfd built on Jan 18 2024 22:03:56
What's interesting is that sometimes the Valheim service will run fine for a while (it ran for up to 8 hours once!), and sometimes it will only live for roughly 5 minutes before failing silently.
Recently, though, the Valheim service has typically been failing silently after an hour, and usually sooner in the last few days. I can't get it to run anywhere near 8 hours again...
If it would help resolve this sooner, I can run the server and collect trace logs. I am a novice at Linux and bash, but I think I could handle running a script similar to the one mentioned above.
@nitroinferno what if you run with BOX64_DYNAREC_STRONGMEM=1? Does it make the server more stable?
@ptitSeb How do I run the valheim service with BOX64_DYNAREC_STRONGMEM=1 from terminal? For example I run the server exec from valheim.service, and typically do 'systemctl start valheim' right in the terminal.
The simplest would be to use either /etc/box64.box64rc or, if the account used for the service has a home, to create a .box64rc file in that home and put inside:
[valheim_server.x86_64]
BOX64_DYNAREC_STRONGMEM=1
And the parameter will automatically be picked up.
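As a sketch of how that section could be added from a shell without duplicating it on repeated runs (the function name is a hypothetical helper, and the default path assumes the service account has a home directory):

```shell
#!/bin/bash
# ensure_valheim_section [RCFILE]
# Append the [valheim_server.x86_64] section to RCFILE (default ~/.box64rc)
# unless a line with exactly that section name is already present.
# (Hypothetical helper, not part of box64.)
ensure_valheim_section() {
    local rc="${1:-$HOME/.box64rc}"
    if ! grep -qxF '[valheim_server.x86_64]' "$rc" 2>/dev/null; then
        printf '%s\n' '[valheim_server.x86_64]' 'BOX64_DYNAREC_STRONGMEM=1' >> "$rc"
    fi
}

# Example:
#   ensure_valheim_section             # writes to ~/.box64rc
#   ensure_valheim_section /tmp/test   # writes to another file
```

The check only looks for the section header, so an existing section with different values is left alone rather than overwritten.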
@ptitSeb I already ran it using BOX64_DYNAREC_STRONGMEM=1. Inspecting the logs with and without STRONGMEM, I figured out that box64 loads libmono, which automatically toggles BIGBLOCK=0 and STRONGMEM=1. Inspecting the folder, it also becomes apparent that mono is involved in the sporadic crash, as mono memory dumps are created when the server silently crashes (the process still runs, but it doesn't provide any of the expected functions).
Thank you, I set it in /etc/box64.box64rc, since the service runs from the /etc/ directory. The program seems to be staying alive and stable now (though it could also be luck of the draw). It has been running for just a bit over 2 hours, which it hasn't done for a while. I will continue to monitor it. It has been quite some time since it ran this long without the silent failure.
To sea212's point, I do have mono_crash.mem files within ~/home/
Update: It seems it was luck of the draw that it ran 2 hours. After restarting, it didn't stay online for longer than 5 minutes, and subsequent restarts wouldn't last beyond 15 minutes.
I did a bit more digging after my server ran for 40 hours straight with no issues, hosting up to 3 players at different times, until it silently crashed again at 40 hours when no one was logged in. Since then it typically fails after 15-45 min.
I went into the directory of the non-responsive PID ( /proc/$[PID] ) to do some investigation. I found the following:
Using cat /proc/$[PID]/wchan it returns: futex_wait_queue
Using sudo cat /proc/$[PID]/syscall it returns something along the lines of: 98 0x150034780 0x80 0x153 0x10080e018 0x0 0x0 0x7fdcdc9a80 0x7facba3c28
This does not change while the program is stuck: it's always 98, always 0x80 for the third value, and it always has '0x10080e018 0x0 0x0'. I found within /usr/arm-linux-gnueabihf/include/asm-generic that 98 corresponds to:
#define __NR_futex 98
__SC_3264(__NR_futex, sys_futex_time32, sys_futex)
#endif
Using sudo cat /proc/$[PID]/stack it always returns the following once the program is stuck:
[<0>] futex_wait_queue+0x78/0xac
[<0>] futex_wait+0x100/0x1c0
[<0>] do_futex+0xec/0x194
[<0>] __arm64_sys_futex+0x84/0x19c
[<0>] invoke_syscall+0x50/0x120
[<0>] el0_svc_common.constprop.0+0x68/0x124
[<0>] do_el0_svc+0x34/0xd0
[<0>] el0_svc+0x30/0x94
[<0>] el0t_64_sync_handler+0xf4/0x120
[<0>] el0t_64_sync+0x18c/0x190
This is all a bit over my head - but hoping this may help determine whether this is box64 related or not..
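The syscall check above can be sketched as a small helper. It only looks at the first field of the `/proc/<pid>/syscall` line and compares it with 98, the arm64 futex number from the header excerpt; the function name is a hypothetical choice, and the number would differ on other architectures.

```shell
#!/bin/bash
# is_futex_wait "LINE"
# LINE is the content of /proc/<pid>/syscall. The first field is the
# syscall number; 98 is __NR_futex on arm64 (assumption: arm64 host).
# Returns 0 if the task is blocked in futex(), 1 otherwise, including
# when the kernel reports "running" instead of a syscall number.
is_futex_wait() {
    local nr
    read -r nr _ <<< "$1"
    [ "$nr" = "98" ]
}

# Example (reading /proc/<pid>/syscall usually needs root):
#   is_futex_wait "$(sudo cat "/proc/$pid/syscall")" && echo "blocked in futex"
```

A thread parked in futex() is normal for an idle worker, so a single positive result says little; the same unchanged line across many samples, as described above, is the more telling sign.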
I believe there may be 2 issues at hand. The crashes on initialization / startup sometimes generate the mono_crash.mem files (these crashes are RARE, at least in my case). I overclocked my rpi4 to 2GHz and it stopped any and all crashing on startup / initialization. I suspect they may somehow be tied to the lower clock rate of the CPU. (I have no gauge of whether this makes logical sense or not.)
I created a service which monitors to see if the primary .x86_64 PID has become stuck in 'Sleeping' status as seen in htop. This is an easy work around to the sporadic failed starts.
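The monitoring service itself was not posted, so the following is only a guess at one building block: reading the state letter that htop displays from the `State:` line of `/proc/<pid>/status`. The function name and the optional second argument are assumptions made for illustration.

```shell
#!/bin/bash
# proc_state PID [PROC_ROOT]
# Print the one-letter process state (R, S, D, Z, ...) taken from the
# "State:" line of PROC_ROOT/PID/status. PROC_ROOT defaults to /proc and
# is a parameter only so the function can be tried against a fake tree.
# (Hypothetical helper, not the service described above.)
proc_state() {
    local pid="$1"
    local root="${2:-/proc}"
    awk '/^State:/ { print $2; exit }' "$root/$pid/status"
}

# Example:
#   [ "$(proc_state "$(pgrep -o valheim_server)")" = "S" ] && echo "sleeping"
```

A healthy server also spends much of its time in state S, so a watchdog built on this would likely need a second signal, such as the same state over many consecutive samples, before restarting anything.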
My main issue is how to diagnose or provide useful info on when the main process gets stuck in the 'sleeping' state. I tried running with BOX64_LOG=1 but I don't think it provided any useful info; it looked pretty much like https://github.com/ptitSeb/box64/issues/1182#issuecomment-1892773295. I tried running with BOX64_LOG=2 and 3, and both generated an enormous log file and the server couldn't get through its startup / initialization routine. I believe the RPI4 isn't powerful enough to handle all this logging and initialization simultaneously.
From strace -c -p <PID of Valheim> I get a lot of errors reported for futex, restart_syscall, and read.
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 52.30    0.062819         112       557        59 futex
 23.66    0.028420        1291        22           clock_nanosleep
 14.65    0.017595        2199         8         4 restart_syscall
  8.88    0.010666        2133         5           epoll_pwait
  0.26    0.000316          26        12           clock_gettime
  0.16    0.000193         193         1           sendto
  0.04    0.000053          13         4         4 read
  0.03    0.000037          37         1           recvfrom
  0.00    0.000003           1         2           pselect6
  0.00    0.000001           0         4           ppoll
------ ----------- ----------- --------- --------- ----------------
100.00    0.120103         194       616        67 total
I suspect the sporadic silent failures are caused by a deadlock or resource contention, where the main process cannot be returned to 'running' status after going to 'sleep'. Are there any BOX64 settings I can tweak to avoid this, or methods I can use to generate more useful log data?
Note: I tried using gdb to get useful trace logs both while running and while stuck in the sleeping state. I'm not sure how useful these are, so I will attach one of each to this post. stack_traceRun.txt stack_tracescrash.txt
Thanks for that analysis, those are some interesting stats.
It seems this is a multitasking issue. My feeling is that BOX64_DYNAREC_STRONGMEM is the main parameter to tweak (values from 0 to 3, 0 is default). But BOX64_DYNAREC_BIGBLOCK might have some effect too (values from 0 to 3, 1 is default).
If MonoBleedingEdge is loaded, it will override those settings, so you need to use BOX64_DYNAREC_BLEEDING_EDGE=0 to disable this lib detection and have full control over the settings.
Use the ~/.box64rc or /etc/box64.box64rc file with the [valheim_server.x86_64] section as mentioned earlier for easy tweaking.
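To make trying different combinations less error-prone, the section can be generated from the two values being tweaked. This is a sketch under the ranges stated above (0 to 3 for both); the function name is hypothetical, and it always emits BOX64_DYNAREC_BLEEDING_EDGE=0 so the chosen values are not overridden.

```shell
#!/bin/bash
# render_valheim_rc STRONGMEM BIGBLOCK
# Print a box64rc section for the given values. Both must be single
# digits from 0 to 3; anything else prints an error and returns 1.
# (Hypothetical helper, not part of box64.)
render_valheim_rc() {
    local strongmem="$1"
    local bigblock="$2"
    local v
    for v in "$strongmem" "$bigblock"; do
        case "$v" in
            [0-3]) ;;
            *) echo "value out of range (0-3): '$v'" >&2; return 1 ;;
        esac
    done
    printf '%s\n' \
        '[valheim_server.x86_64]' \
        "BOX64_DYNAREC_STRONGMEM=$strongmem" \
        "BOX64_DYNAREC_BIGBLOCK=$bigblock" \
        'BOX64_DYNAREC_BLEEDING_EDGE=0'
}

# Example: overwrite a per-user rc file with one combination to test.
#   render_valheim_rc 3 2 > ~/.box64rc
```

Note that the example redirect replaces the whole file, which is only appropriate if it holds nothing but this section.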
Hi, author of the guide in OP here. Thank you all looking into this. :)
I've not had much time to look into this myself and I also brought down my original Ampere instance in OCI. A month ago I decided to spin up an instance for Terraria and this week I repurposed it for Valheim again to try to help out figuring out this issue.
The instance was running the latest version of Valheim Valheim l-0.217.38 (network version 20) using Box64 v0.2.6.
No changes were done to environment variables used by Box64 or box64rc.
There were a couple of packages that were needed for Terraria specifically, but unfortunately I did not manage to note down the exact ones prior to tearing it down to redo the install (one I do remember off the top of my head was libcurl3-gnutls).
I'm in the process of spinning up a new OCI instance using the same specs and will do a clean install of Valheim to narrow down the exact packages.
Edit: I forgot to add that the Valheim server was running over night with no signs of issues (total runtime was about 20 hours prior to teardown).
Edit2: Instance is up and has been running for about 3 hours, no additional steps have been done other than running the installation script from my guide. I will be letting the server run for a while, connect and play a bit over the weekend then monitor the logs over time to look for abnormalities. Gist containing details about the server packages etc can be found here and will be updated over time https://gist.github.com/husjon/a94b6760e036e83d0b67a04e3916033d
Update 1: The server usually logs the number of connections every 10 minutes, e.g.:
02/09/2024 22:15:15: Connections 0 ZDOS:30349 sent:0 recv:0
After about 4 hours (18:41 - 22:40 UTC) the server deadlocked, and I had to send SIGILL to get it to stop.
This does at the very least give me a baseline to work from.
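Since a healthy server writes that Connections line every 10 minutes, a log that has not been touched for noticeably longer is a usable deadlock signal. A minimal sketch, assuming the server output is redirected to a file; the function name, the log path, and the 15-minute threshold are all illustrative assumptions:

```shell
#!/bin/bash
# log_is_stale FILE MINUTES
# Return 0 if FILE was last modified more than MINUTES minutes ago,
# 1 otherwise (including when FILE does not exist).
# (Hypothetical helper for a watchdog; not part of Valheim or box64.)
log_is_stale() {
    local file="$1"
    local minutes="$2"
    [ -n "$(find "$file" -mmin +"$minutes" 2>/dev/null)" ]
}

# Example, e.g. from a cron job or systemd timer:
#   log_is_stale /home/ubuntu/valheim_server.log 15 && echo "server looks deadlocked"
```

A threshold somewhat above the 10-minute logging interval avoids false alarms from a single delayed write.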
Update 2: After restarting from the last run and letting it run over night the server ran for just over an hour and deadlocked with no players online.
Update 3: It seems the packages I mentioned were a false report; I must have used them for something else. I installed the Terraria server and it started without adding more packages than were already installed. In any case, I will start looking at using box64rc to see how the Valheim server behaves.
Hi All, We have two servers running on a Raspberry Pi 5. They've been up for around 60 hours, with 3 players, with this config (and still going strong):
[valheim_server.x86_64]
BOX64_DYNAREC_STRONGMEM=3
BOX64_DYNAREC_BLEEDING_EDGE=0
BOX64_DYNAREC_BIGBLOCK=3
EDIT: The server ran for around 100 hours without any problem, and it has since been downgraded to a Raspberry Pi 4, which has been running for 10 hours without an issue.
Similar to @ebarrragn I set the following:
Environment=BOX64_DYNAREC_STRONGMEM=3
Environment=BOX64_DYNAREC_BIGBLOCK=2
Environment=BOX64_DYNAREC_BLEEDING_EDGE=0
The server's been MUCH more stable, running for long hours on a Pi 4 4GB. It ran for 43 hours the other day, which it has never done before for me. I believe modifying the settings may be the solution. I will have to try with bigblock = 3.
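For anyone who prefers not to edit the unit file itself, the same three lines can live in a systemd drop-in. The sketch below writes one; the function name is hypothetical, and the directory in the comment assumes a system unit called valheim.service, as in the earlier `systemctl start valheim` example.

```shell
#!/bin/bash
# write_box64_dropin DIR
# Write DIR/override.conf containing the three Environment= lines from
# this comment. For a system unit named valheim.service, DIR would be
# /etc/systemd/system/valheim.service.d (writing there needs root).
# (Hypothetical helper; the unit name is an assumption.)
write_box64_dropin() {
    local dir="$1"
    mkdir -p "$dir"
    cat > "$dir/override.conf" <<'EOF'
[Service]
Environment=BOX64_DYNAREC_STRONGMEM=3
Environment=BOX64_DYNAREC_BIGBLOCK=2
Environment=BOX64_DYNAREC_BLEEDING_EDGE=0
EOF
}

# Afterwards, reload systemd and restart the unit:
#   sudo systemctl daemon-reload
#   sudo systemctl restart valheim
```

Keeping the variables in a drop-in means they survive a reinstall that overwrites the main unit file.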
I've been trying to run the server in OCI on Ampere trying out a variety of environment variables but not getting any positive results, using the following environment variables:
BOX64_LOG=1
BOX64_DYNAREC_BIGBLOCK=3
BOX64_DYNAREC_STRONGMEM=3
BOX64_DYNAREC_BLEEDING_EDGE=0
BOX64_TRACE_FILE=/home/ubuntu/valheim.box64.log
The tracefile is spammed with the following segfaults triggered by FillBlock:
FillBlock at 0x2fdd70 triggered a segfault, canceling
FillBlock triggered a segfault at 0x300000 from 0x35012ef4
FillBlock at 0x2fddbc triggered a segfault, canceling
FillBlock triggered a segfault at 0x300000 from 0x35012ef4
FillBlock at 0x2fddd4 triggered a segfault, canceling
https://gist.github.com/husjon/a94b6760e036e83d0b67a04e3916033d#file-valheim-box64-deadlock-01-log
It then refuses to continue or stop, and I have to send SIGILL. I'll continue to play around with the variables to see if I can get some more info.
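When the trace file is dominated by these lines, a short summary is easier to share than the raw log. A sketch that counts the FillBlock segfault lines and groups the faulting addresses, assuming the two message formats shown above; the function name is hypothetical:

```shell
#!/bin/bash
# fillblock_summary FILE
# Print the number of FillBlock segfault lines in FILE, followed by each
# distinct "segfault at <address>" with how often it occurs.
# (Hypothetical helper; patterns are taken from the log excerpt above.)
fillblock_summary() {
    local file="$1"
    grep -c 'FillBlock.*triggered a segfault' "$file"
    grep -o 'segfault at 0x[0-9a-f]*' "$file" | sort | uniq -c
}

# Example, using the trace file path from this comment:
#   fillblock_summary /home/ubuntu/valheim.box64.log
```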
Update: Leaving only the following variables the server starts up normally
BOX64_LOG=1
BOX64_TRACE_FILE=/home/ubuntu/valheim.box64.log
normal startup log: https://gist.github.com/husjon/a94b6760e036e83d0b67a04e3916033d#file-valheim-box64-normal-log
Make sure you are using the latest version of box64 also. Or at least the same version across those various tests.
I've been on v0.2.6 for all tests so far, but I'll take a look at building and testing towards latest on main a bit later.
After setting BLEEDING_EDGE=0 and STRONGMEM=3, the server has now been running for just over 20 hours (using Box64 v0.2.6).
I will let the server run for a while longer to see how it behaves.
Update 1: The server has now been running for close to 30 hours and is still chugging along. Still, I feel it might be too early to say for sure, so I'll continue to monitor the server. Currently the server is using Steam's multiplayer backend, but I'm curious to see how it would behave with Crossplay enabled, as that would use PlayFab's backend instead (this was a massive PITA when we tried getting it going before). I might spin up a separate instance for testing that at some point.
My server is encountering a crash after 5-15mins which subsequently throws a "failed to connect" when trying to reconnect. We end up needing to restart the server to connect successfully only for it to eventually crash again. This was with just 1-2 people to test the variables.
I'm running the suggested variables mentioned above:
BOX64_DYNAREC_BIGBLOCK=3
BOX64_DYNAREC_STRONGMEM=3
BOX64_DYNAREC_BLEEDING_EDGE=0
@Pohtaytoh Try setting the bigblock above to =2. Just a shot in the dark, but mine's been consistently working for 40+ hours with it set to that and the other settings above, though I'm hosting on an rpi4 rather than on OCI Ampere.
My server running on my OCI Ampere instance is still running smoothly after 50+ hours. I haven't had much time to play but I do hop onto it every now and then to explore and generate terrain etc.
● valheim_server.service - Valheim Dedicated Server
Loaded: loaded (/home/ubuntu/.config/systemd/user/valheim_server.service; enabled; vendor preset: enabled)
Active: active (running) since Sat 2024-02-10 15:51:57 UTC; 2 days ago
Main PID: 42305 (valheim_server.)
Tasks: 36 (limit: 14216)
Memory: 2.0G
CPU: 22h 5min 10.234s
CGroup: /user.slice/user-1001.slice/user@1001.service/app.slice/valheim_server.service
Variables used as mentioned in my previous comment (with Box64 v0.2.6):
BOX64_DYNAREC_STRONGMEM=3
BOX64_DYNAREC_BLEEDING_EDGE=0
BIGBLOCK is not used.
Just wanted to follow up that it's been much more stable running with:
BOX64_DYNAREC_STRONGMEM=3
BOX64_DYNAREC_BLEEDING_EDGE=0
Though we're still getting a crash after several hours. The interval between crashes seems to be around 3-5 hrs.