valheim-server-docker
RAM usage grows with each update
Hello,
After each update, a little more RAM is used, and it keeps growing until memory is full (16 GB). I have to reboot to restore performance. Is it the same for everyone? What can I do to avoid this?
Thanks !
Don't think this is a "Question" but an actual issue, just worded poorly?
Also having this issue, though mine seems to happen with every backup.
Attached logs and also RAM usage from my Docker container.
If I leave Valheim running in Docker, it crashes the whole LXC... which is no bueno and does not help a 24/7 server.
As you can see from this screenshot, RAM starts climbing at around 10:36 (server time). The logs are offset by 1 hour (due to daylight savings), so at 11:36 it looks like the world backup also starts. The issue is that RAM increases with every backup?
Running in an LXC within Proxmox. I have stopped all other containers, and the stepwise RAM increase happens only when Valheim is running.
Here is an image showing the LXC container crashing: every 30-minute backup increases RAM until it crashes:
Confirming the issue by watching the Portainer Stats page during a backup: RAM increases during the backup and then remains at the higher level. This means each backup adds to RAM usage:
Hello, thanks for your reply. I use Proxmox too and saw exactly the same thing. But since the last update (3 days ago), and with UPDATE_CRON="*/60 * * * *", the increase in RAM usage is very limited. I'm keeping an eye on it anyway.
edit: maybe the quotation marks cancel the updates... I'll restart without this environment variable.
Yeah, I realised that my backups were running every 30 minutes, but after reading the docs they should have defaulted to every hour?
Regardless, I've now swapped over to once daily with the CRON variable.
Still, if running backups is for some reason adding RAM usage after every run, then a fix will still be required, or we'll have to keep restarting the container every so often...
I also don't know what you mean by your edit: "" cancels updates?
I mean, maybe the quotation marks were too much and negated the updates. So the last update of valheim-server-docker does not solve the issue.
The issue is still here :
I seem to be running into the same issue when the Docker image does a world save. As you can see in the graph, it causes a slight uptick in CPU and RAM usage during the backup. The CPU utilization goes back to normal, but the memory is never freed unless I restart the Docker image.
Same for us. Yesterday the server crashed for the first time due to 100% RAM usage. It starts at around 3.8 GB and goes up by about ~1 GB per step.
Do you know where the world save is landing? If it's in some tmpfs this might make sense.
Otherwise, with such a reproducible problem, it should be pretty simple to track down. Take some snapshots of usage (from one of /proc/<pid>/{stat,status,statm}, where status is the most approachable) and compare over time.
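A minimal sampling loop along those lines could look like this (a sketch; PID defaults to the current shell just so it runs anywhere, but in practice you'd point it at the game server process, e.g. PID=$(pgrep -f valheim_server)):

```shell
# Sample a process's resident set size (VmRSS) from /proc/<pid>/status.
# PID defaults to the current shell to keep the sketch self-contained.
PID=${PID:-$$}
for i in 1 2 3; do
  printf '%s ' "$(date +%T)"
  grep VmRSS "/proc/$PID/status"
  sleep 1   # increase the interval for real monitoring
done
```

Redirect the output to a file and compare the values after a few backup runs; a genuine leak would show VmRSS stepping up with each run.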
Late to this as well, but I am also facing the same issue. To mitigate it in the meantime I've essentially just set up a cronjob to restart the container every week. As you can imagine, that's not ideal...
When I have some time I'll exec in and poke about..
I've tried to replicate the issue but couldn't. Can you see if there's a process that's getting stuck with each backup? Do you have any pre/post backup hooks that might hang?
For memory to leak there has to either be a continuously running process that's allocating more memory than it is freeing, or there have to be new processes occupying memory that are adding up with each backup run.
Looking at the backup script, the external commands it calls are zip and cp for the backup, and find, sort, cut, tail, xargs and rm for the cleanup of old backups.
Steps that would help with debugging the issue, next time it happens:
- exec into the container and get a process list. See if there's an excessive number of any of the aforementioned external commands running.
- identify how much resident memory each of the processes running inside the container is occupying. This can be done inside the container or on the host system.
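Both steps can be covered with a single ps invocation, e.g. (a sketch; the exact options available depend on the ps build shipped in the container):

```shell
# List processes sorted by resident set size (RSS, in KiB), largest last.
# Run inside the container (docker exec -it <container> sh) or on the host.
ps -eo pid,rss,comm --sort=rss | tail -n 10
```

Leftover zip/cp/find processes piling up after each backup run would show up here immediately.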
I also went over the valheim-backup script itself but couldn't see any obvious spot where it would leak memory. As far as I can tell there is no point where it appends data or otherwise uses a variable that grows in size. But I would appreciate a second pair of eyes. The script is only about 160 lines and fairly straightforward.
We disabled the in-game Valheim backup function and only use the backup function that comes with this Docker image, but it's still using a lot of memory...
Hello :)
After monitoring for 2 days, I can tell that it's the "buff/cache" which grows, according to the "free" command. The used memory is stable at 3 GiB. OS: Ubuntu Server 22.04.1.
Then you should monitor /proc/meminfo for growth and see which entry grows. Also review the meminfo section of proc.txt for more detail.
But the conventional wisdom is that if it's in buffers and cache, it's not really "used", because the kernel can give it to another process. See also free(1), which explains the fields, and then from proc.txt above:
Buffers: Relatively temporary storage for raw disk blocks
shouldn't get tremendously large (20MB or so)
Cached: in-memory cache for files read from the disk (the
pagecache). Doesn't include SwapCached
So if there's really a leak and it's only manifesting as buff/cache growth in free, I don't think you've found the indicator.
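One quick way to watch /proc/meminfo for growth might be the following (a sketch, run on the host; INTERVAL is just an assumed knob for the wait between snapshots):

```shell
# Snapshot /proc/meminfo twice and diff to see which entries grow.
INTERVAL=${INTERVAL:-60}
cp /proc/meminfo /tmp/meminfo.before
sleep "$INTERVAL"
# diff exits non-zero when entries changed; that changed output is the point
diff /tmp/meminfo.before /proc/meminfo || true
```

Running it across a backup window would show whether the growth lands in Cached, Buffers, or an entry like AnonPages that would actually indicate a leak.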
@fuse1985 if you look in the running Docker container, and your backups are going to /backups, what does df -h /backups show? (or wherever your backups are going)
That's my backup setting, btw, so only 1 backup a day and no backups from the game settings:
BACKUPS_CRON=0 3 * * *
BACKUPS_MAX_AGE=3
SERVER_ARGS=-backups 0 -saveinterval 1200
And those 4 backups are about ~150MB each.
I was thinking it could be a tmpfs, but it doesn't seem to be... Seems like you must also be setting BACKUPS_DIRECTORY if it's not /config/backups, though?
I added the "BACKUPS_MAX_COUNT=3" environment variable and now the RAM usage is stable at around 3 GiB. So I conclude that each backup was kept in cache in some way.
If that's the case you should be able to recover the memory using /proc/sys/vm/drop_caches:
/proc/sys/vm/drop_caches (since Linux 2.6.16)
Writing to this file causes the kernel to drop clean
caches, dentries, and inodes from memory, causing that
memory to become free. This can be useful for memory
management testing and performing reproducible filesystem
benchmarks. Because writing to this file causes the
benefits of caching to be lost, it can degrade overall
system performance.
To free pagecache, use:
echo 1 > /proc/sys/vm/drop_caches
To free dentries and inodes, use:
echo 2 > /proc/sys/vm/drop_caches
To free pagecache, dentries, and inodes, use:
echo 3 > /proc/sys/vm/drop_caches
Because writing to this file is a nondestructive operation
and dirty objects are not freeable, the user should run
[sync(1)](https://man7.org/linux/man-pages/man1/sync.1.html) first.
[1] https://man7.org/linux/man-pages/man5/proc.5.html
This is the result of running the above that opello suggests (taken from portainer GUI):
ie:
# sync
# echo 3 > /proc/sys/vm/drop_caches
Note that this was run on the host system, not the container. The container I'm running uses the defaults (internal backups every 15 minutes, create a worlds-YYYYmmdd-HHMMSS.zip backup every 60 minutes) and has been running for about 2 days. The game db is roughly 50 MB, give or take.
Note also that the performance of the system seemed fine. The host never reported that all that much RAM was being consumed (free never showed more than 7-8 GB of usage, not bad considering it's hosting 2 Valheim services and 3 Minecraft worlds), and physical storage usage isn't too high either.
It's completely normal and advantageous for Linux to use memory for caching when memory is available. If something needed this memory, it would be made available to it. Dropping the caches, when you aren't doing some kind of testing where you don't want caching to affect your results, just hurts performance for no reason.
If dropping the caches recovers the memory thought to be going to a leak it's safe to do nothing going forward.
While I agree that it's advantageous for a Linux system to fully utilize any available RAM for buffers/cache, I find it odd that the amount of RAM consumed continues to increase in lockstep with the valheim-updater service in the container, which is triggered every 15 minutes by default. ~Surely, there can't be any appreciable amount of performance gained by the Linux system from caching these .zip files every 15 minutes?~ (EDIT: removed this line, as I was misreading the cron ENV var I set up; the 15 minutes is how frequently the updater runs, not the backup service.) I'll let the system run for about a week and see what the buff/cache looks like after that period. Though based on the relevant sar -r data I've been tracking for a few days, the buff/cache size isn't really increasing all that much. I'll also see how long it takes to run out of RAM (free + swap) and for the system to crash, as that appears to be the case for at least some people in this thread.
Besides, if this was an issue with the host consuming more buff/cache, then this wouldn't show up through the container stats view in Portainer. The other containers I'm running on the host (a couple of Minecraft Docker containers) are better behaved and don't exhibit the steady, continuous rise in RAM usage as reported by the container, in 15-minute increments exactly corresponding with the valheim-updater process, even when actively used. Either way, I'll actively not maintain the server (i.e. no stop/start of the Valheim container) for the next 2 weeks and will hopefully remember to report back here with what I've found.
@jonvel Please also collect some additional data:
docker stats --no-stream --no-trunc <container>
I imagine this is an artifact of an older Portainer, prior to v1.20.0, which introduced separating the cache out from the memory usage:
- https://github.com/portainer/portainer/issues/1961
- https://github.com/portainer/portainer/issues/2380
- https://github.com/portainer/portainer/issues/2074#issuecomment-429574769
Since docker stats ignores caches, having that along with the other data collected should be helpful in confirming.
Surely, there can't be any appreciable amount of performance gained by the Linux system for caching these .zip files every 15 minutes+?
On a system with low file I/O it seems reasonable that any write that can be held in cache would be, on the off chance it's read again, especially when it can easily be evicted.
Besides, if this was an issue with the host consuming more buff/cache, then this wouldn't show up through the container stats view in portainer.
Per https://github.com/portainer/portainer/issues/2380 it depends on the version. Based on your graph I expect yours is prior to v1.20.0 since it doesn't have the cache separate on the graph. The drop_caches affecting the graph based on your earlier comment also supports this conclusion.
As for your Minecraft containers, they probably aren't writing out new files but updating existing ones? I guess I'm not really familiar with how Minecraft persists changes at the server.
@opello - no, I'm running the Portainer container from about 2 weeks ago (v2.19.0, but Community Edition, because I'm cheap). However, the interesting parts here are:
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
3da1bdd5e346c1adf valheim.service 21.78% 4.627GiB / 30.71GiB 15.07% 8.29GB / 1.7GB 16.4GB / 37.3GB 64
Note that one of these days, I'll do better tuning to restrict how much RAM the container can use, but I don't do this professionally, and this is really only a whim to learn more about containerization. The Minecraft services were mostly irrelevant, other than to point out that I found it odd that so much "RAM" was being consumed. But that's fine. I'll let it sit for a while longer and see what the stats look like in the future.
Also - I'll edit my response above, but the steps up in usage occur not when a backup is created but when the valheim-updater process runs (once every 15 minutes).
@jonvel, sorry for the misread, I was stuck thinking about backups and you'd said it was the updater.
This is starting to make sense after reviewing the valheim-updater code and the stdout log. When it runs it does a few things; most relevant to this issue is asking steamcmd.sh to +app_update, but it also passes STEAMCMD_ARGS, which according to the README.md (and the behavior in my log, anyway) includes validate:
INFO - Downloading/updating/validating Valheim server from Steam
Redirecting stderr to '/home/valheim/Steam/logs/stderr.txt'
Logging directory: '/home/valheim/Steam/logs'
[ 0%] Checking for available updates...
[----] Verifying installation...
Steam Console Client (c) Valve Corporation - version 1694466999
-- type 'quit' to exit --
Loading Steam API...dlmopen steamservice.so failed: steamservice.so: cannot open shared object file: No such file or directory
OK
Connecting anonymously to Steam Public...OK
Waiting for client config...OK
Waiting for user info...OK
Update state (0x5) verifying install, progress: 0.14 (2097152 / 1515942651)
Update state (0x5) verifying install, progress: 21.70 (329012463 / 1515942651)
Update state (0x5) verifying install, progress: 52.05 (789104462 / 1515942651)
Update state (0x5) verifying install, progress: 84.08 (1274660014 / 1515942651)
Success! App '896660' fully installed.
.d..t...... ./
INFO - Valheim Server is already the latest version
https://developer.valvesoftware.com/wiki/SteamCMD#Downloading_an_App
To also validate the app, add validate to the command.
So, during the update every 15 minutes, steamcmd is validating the download, which means it's reading the ~1.4 GiB download directory from disk. It makes sense to load this into cache rather than hit the slower disk again, if there's available memory. I think it's "not a concern", but if you disagree you should be able to remove validate from STEAMCMD_ARGS.
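If anyone wants to try that, overriding the variable in the container's environment should be enough (a sketch; per the README the default value of STEAMCMD_ARGS is validate, so setting it empty skips the validation pass):

```
# docker-compose / env-file style; the default is STEAMCMD_ARGS=validate.
# An empty value skips re-reading the ~1.4 GiB install on every update check.
STEAMCMD_ARGS=
```

The trade-off is that a corrupted server install would no longer be detected and repaired automatically on the next update check.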
I'm not sure why cache isn't separate from memory in your latest Portainer version, but the fact that drop_caches makes the number go down clearly shows that caches are included in the value.
I'm seeing similar issues where my swap is getting maxed out and slowing everything down to the point where the server connection is timing out:
Output from free -h:
total used free shared buff/cache available
Mem: 7.7Gi 2.9Gi 119Mi 22Mi 5.0Gi 4.8Gi
Swap: 2.0Gi 2.0Gi 33Mi
Output from top with the SWAP column:
PID USER PR NI VIRT RES SHR S %CPU %MEM SWAP TIME+ COMMAND
10889 root 20 0 9915528 373348 53976 S 10.60 4.632 1.808g 39:21.68 valheim_server.
In preliminary troubleshooting, this does appear to happen after the update check.
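For what it's worth, per-process swap usage can also be read straight from /proc rather than from top (a sketch; PID defaults to the current shell just so it runs anywhere, and the VmSwap field needs a reasonably modern kernel):

```shell
# Show resident and swapped-out memory for a process from /proc/<pid>/status.
# In practice use the game server PID, e.g. PID=$(pgrep -f valheim_server).
PID=${PID:-$$}
grep -E 'VmRSS|VmSwap' "/proc/$PID/status"
```

Logging those two lines around each update check would confirm whether it's the valheim_server process itself being pushed into swap.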