Possible Memory Leak

Open petulikan1 opened this issue 2 years ago • 89 comments
Current Behavior

As of writing, we're dealing with a RAM issue. When we start a plain Spigot server without any plugins and connect to it more than 100 times, RAM usage increases over time. However, the RAM shown in the panel doesn't decrease at all; once it's allocated, it stays that way.

We've run several tests with heap dumps to check whether any of the plugins has a memory leak, but didn't find any.

image

Here's an image of a server that has been running for more than 20 hours with 4 GB of RAM. Since this server doesn't create new threads, it isn't crashing, but a different server that does create threads is.

Expected Behavior

The panel should be able to decrease the RAM of the running container to prevent unexpected crashes (OOM killer disabled). Once RAM usage increases, it doesn't decrease, which is a pain for a server with over 40 plugins and over 100 players: it crashes unexpectedly once it reaches the container limit, because some plugins need to create new threads and there is no memory available for the threads themselves (native memory).

Steps to Reproduce

Make a server with a Paper jar, allocate 4 GB of RAM and connect to it until usage reaches 4 GB. Leave it for about an hour and you'll see the same RAM usage as when you left it.

Panel Version

1.11.1

Wings Version

1.11.0

Games and/or Eggs Affected

Minecraft (Paper)

Docker Image

ghcr.io/pterodactyl/yolks:java_17

Error Logs

No response

Is there an existing issue for this?

  • [X] I have searched the existing issues before opening this issue.
  • [X] I have provided all relevant details, including the specific game and Docker images I am using if this issue is related to running a server.
  • [X] I have checked in the Discord server and believe this is a bug with the software, and not a configuration issue with my specific system.

petulikan1 avatar Dec 22 '22 23:12 petulikan1

Have you changed the startup command for this server?

parkervcp avatar Dec 22 '22 23:12 parkervcp

For the testing server (where I tested the RAM just by joining it) no I didn't. Left it just as the panel made it.

java -Xms128M -Xmx4096M -jar server.jar

petulikan1 avatar Dec 23 '22 03:12 petulikan1

Add swapaccount=1 cgroup_enable=memory to GRUB_CMDLINE_LINUX_DEFAULT and GRUB_CMDLINE_LINUX in /etc/default/grub, then reboot your dedicated server. After that, make sure the allocated swap is 0 in your game server configuration. Use the default kernel or one compiled by you, not a custom one (liquorix, xanmod); for xanmod, maybe the LTS version can save you.

P.S.: Update Docker to the latest version. I had a lot of problems with Wings versions above 1.5.3 on older Docker versions.

Mutex21 avatar Dec 23 '22 08:12 Mutex21

I have exactly the same problem, and I have added "swapaccount=1 cgroup_enable=memory setup" to the grub config...

MrBretze avatar Jan 24 '23 09:01 MrBretze

Did that help you resolve this problem?

petulikan1 avatar Jan 24 '23 09:01 petulikan1

Grub: swapaccount=1 cgroup_enable=memory. Check docker info, update Docker to the latest version, and make sure Allocated Swap in the Pterodactyl panel (game server) is 0.

P.S.: you need to restart the entire dedicated server.

Mutex21 avatar Jan 24 '23 09:01 Mutex21

Alright thanks, I'll try this and then respond if that worked or not.

petulikan1 avatar Jan 24 '23 09:01 petulikan1

I have made this change (swapaccount and cgroup_enable were already set this way in the grub config). I have rebooted my dedicated server, and the server's RAM usage still doesn't decrease.

My dedicated server has a Ryzen 5 3600, 16 GB of RAM and a RAID 1 of 1 TB HDDs.

If you want any other information, just tell me.

OS:

Ubuntu 22.04.1 LTS

Docker info

https://pastebin.com/574zu870

Grub Info

https://pastebin.com/ka4nuanG

Panel

1.11.2

Wings

1.11.0

Docker Image

ghcr.io/pterodactyl/yolks:java_17

MrBretze avatar Jan 29 '23 16:01 MrBretze

So it was just slowly increasing and at no point decreasing right?

petulikan1 avatar Jan 29 '23 17:01 petulikan1

I tried this as well and RAM wasn't decreasing so I'm assuming it's a real problem.

petulikan1 avatar Jan 29 '23 17:01 petulikan1

So it was just slowly increasing and at no point decreasing right?

Yes

I tried this as well and RAM wasn't decreasing so I'm assuming it's a real problem.

Yes, or a configuration problem, but I don't know what the problem is...

MrBretze avatar Jan 29 '23 17:01 MrBretze

So after some testing, I found a "Java issue" behind this one: Java NEVER collects the G1 old generation. It's supposed to do so automatically, but I don't know why it doesn't under Docker.

I tried adding the Java arguments -XX:+UnlockExperimentalVMOptions and -XX:+UseContainerSupport, but that doesn't change the problem.

With the spark plugin, if I execute the command spark heapsummary, a collection of the G1 old generation is forced and the memory used by the server decreases.

I have forced Java to use the parallel collector (instead of G1), and although I have no memory issues, I get lag spikes, so this doesn't seem to be a viable solution.

For now I've added the Java arguments -XX:MinRAMPercentage=25.0 -XX:MaxRAMPercentage=50.0 and it partially works.

I suppose it's a Docker/egg issue not related to the panel...

I have found those two links that may be helpful: https://developers.redhat.com/blog/2017/03/14/java-inside-docker https://www.merikan.com/2019/04/jvm-in-a-container/

MrBretze avatar Feb 07 '23 09:02 MrBretze

Thanks for letting me know about this one. I'll look into that once I'm back home.

Once again thanks.

petulikan1 avatar Feb 07 '23 09:02 petulikan1

Has anyone tested running the exact same version of Paper outside of Pterodactyl? Does your system also report the same amount of memory consumption (e.g. htop)?

schrej avatar Feb 07 '23 10:02 schrej

Does your system also report the same amount of memory consumption (e.g. htop)?

Yes, it reports the same amount of memory.

Has anyone tested running the exact same version of Paper outside of Pterodactyl?

I tested outside of Pterodactyl and didn't see any problem, but I need to redo the test properly.

MrBretze avatar Feb 07 '23 16:02 MrBretze

I have also been having this issue,

I've been finding for a while now that the Docker containers use a lot more RAM than the servers themselves.

It has been common to see one of our Minecraft servers that is set to use -Xmx16G start using 20 GB+ after a few hours. God forbid you don't set a container RAM limit; I've seen 40 GB+.

I noticed this started happening after updating from Ubuntu 18 to Ubuntu 22.04.1 LTS. (Docker was inherently updated as well, but I don't know what version we were using.)

From all the things I have tried, I get the feeling it's a Docker-related issue, since I was able to recreate this by manually booting a server in Docker and seeing the same excess memory consumption by the container.

KugelblitzNinja avatar Feb 09 '23 19:02 KugelblitzNinja

Hey! I have the same issue,

Has anyone found a reasonable solution? I thought the problem was connected with plugins, so I tried running different servers, without plugins and on different versions, and I still have the issue. Attaching more RAM to the container helps, but I wonder if swap memory could also help.

Anyway, any response with some updated information or solution will be appreciated! Gosh.. Well, at least the problem for sure isn't connected with plugins.

Loren013 avatar Feb 18 '23 16:02 Loren013

Has anyone tested running the exact same version of Paper outside of Pterodactyl? Does your system also report the same amount of memory consumption (e.g. htop)?

So after having performed tests outside of Pterodactyl, I don't have any issues regarding memory usage. So I suppose the JVM and Docker are the troublemakers here.

MrBretze avatar Feb 20 '23 15:02 MrBretze

@KugelblitzNinja and @Loren013 Currently, the only 'fix' I have found is to use this command line:

java -Xms128M -XX:+UseContainerSupport -XX:MinRAMPercentage=25 -XX:MaxRAMPercentage=50 -jar {{SERVER_JARFILE}}

It's important not to specify the 'Xmx' Java argument, otherwise it won't work.

MrBretze avatar Feb 20 '23 16:02 MrBretze

@KugelblitzNinja and @Loren013 Currently, the only 'fix' I have found is to use this command line:

java -Xms128M -XX:+UseContainerSupport -XX:MinRAMPercentage=25 -XX:MaxRAMPercentage=50 -jar {{SERVER_JARFILE}}

It's important not to specify the 'Xms' Java argument, otherwise it won't work.

Specify Xms or Xmx? Asking because in the startup command you wrote 'Xms' and you're saying not to specify the 'Xms'

petulikan1 avatar Feb 21 '23 08:02 petulikan1

Specify Xms or Xmx? Asking because in the startup command you wrote 'Xms' and you're saying not to specify the 'Xms'

Oops, indeed I was wrong! Sorry!

MrBretze avatar Feb 21 '23 08:02 MrBretze

-XX:MaxRAMPercentage=50 can work but is a pain, wasting so much RAM.

The most useful thing I have tried so far is playing around with different Docker bases.

For me it's been less of an issue when using Docker containers with a Debian base.

Currently I am using kugelblitzninja/pterodactyl-images:debian-zulu-openjdk-19 <-- If you do try using this, please let me know how it works for you, and be aware that it has extra software added to create a backup on server shutdown!

With this we can run something between -XX:MaxRAMPercentage=80 and -XX:MaxRAMPercentage=93.

KugelblitzNinja avatar Feb 21 '23 10:02 KugelblitzNinja

On a side note, the following is to help with another problem this issue can cause.

If you have not already, disable the OOM killer.

If your servers are still being killed when the container runs low on free RAM, instead of going into a zombie-like state (until assigned more RAM), there is a good chance that, like me, you will find that the panel was unable to disable the OOM killer but did not say anything, and that the grub file edits recommended in their Discord were of no help.

I was able to verify this by reading the system logs.

If you find this is the case, you're going to have to take to Google to find other ways to allow the panel to disable it.

So far I have only had this issue with the host running Ubuntu 22.

KugelblitzNinja avatar Feb 21 '23 10:02 KugelblitzNinja

I forgot to mention something important in the above,

If your server is getting killed by the OOM killer, this does not mean your Minecraft server ran out of RAM! Just that the container is now using its allocated amount.

(Be aware that the OOM killer can fail to be disabled with no notification in the panel.)

To see if your Minecraft server actually ran out of RAM, check your log files to see why it crashed: you will see out-of-memory exceptions in your server log file and/or in your JVM crash report (look for something like hs_err_pid29.log in the root of your server).

If it just died with no error in the log files and they simply end, it was the OOM killer. (This can cause world corruption.)
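The log check described above could be scripted roughly as follows. This is only a sketch under assumptions: the helper name `find_jvm_oom_evidence` is hypothetical, and the log location `logs/latest.log` is assumed and may differ on your server. A crash report on its own only shows that the JVM crashed, so it still needs to be read by hand.

```python
from pathlib import Path

# Text the JVM writes when it throws an out-of-memory error.
OOM_MARKER = "java.lang.OutOfMemoryError"


def find_jvm_oom_evidence(server_root):
    """Return (file name, reason) pairs hinting that the JVM itself failed."""
    root = Path(server_root)
    evidence = []

    # JVM fatal-error reports look like hs_err_pid<pid>.log in the server root.
    for crash_report in sorted(root.glob("hs_err_pid*.log")):
        evidence.append((crash_report.name, "JVM crash report present"))

    # Assumed log location; adjust if your server writes logs elsewhere.
    latest_log = root / "logs" / "latest.log"
    if latest_log.is_file():
        text = latest_log.read_text(encoding="utf-8", errors="replace")
        if OOM_MARKER in text:
            evidence.append((latest_log.name, "OutOfMemoryError logged"))

    return evidence
```

An empty result is consistent with the logs "just ending", which points toward the OOM killer; the host's system logs would still be needed to confirm that.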

KugelblitzNinja avatar Feb 21 '23 11:02 KugelblitzNinja

To everyone reading this issue and editing their kargs, stop. Reverting to cgroups v1 is not a solution or a fix to any of these problems, it's a terrible workaround that causes more problems, not less.


This problem is caused by many different factors, and is an issue specifically regarding the images and startup commands themselves, nothing else.

First off, setting Xmx to the amount of memory allocated to the container will cause an OOM if all that memory is actually used. If the JVM uses all the memory assigned to the container, there is little to no memory left for anything outside the JVM; Java itself doesn't just use the JVM heap and requires memory outside of it. (Setting Xmx also overrides the MaxRAMPercentage flags and disables the automatic container memory limit detection built into newer versions of Java.)

Secondly, the ghcr.io/pterodactyl/yolks:java_8 and ghcr.io/pterodactyl/yolks:java_11 images both lack container detection support (I am working on a fix for this). They will instead detect 1/4 of the memory available on the host system by default, which will then be affected by the MaxRAMPercentage flag. So if you are running these images and experiencing issues, you will want to set -Xmx to a value below the amount of memory allocated to the container; an overhead of 128 MB or so should be more than enough. And for those wondering, no, the -XX:+UseContainerSupport flag does not help, and is only required for Java 8; Java 10 and above have it enabled by default, assuming the build of Java actually has the feature, which these specific builds seem to lack. The ghcr.io/pterodactyl/yolks:java_8j9 image does have support for containers, but the -XX:+UseContainerSupport flag will need to be added for it to work.
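As a rough illustration of the arithmetic for images without container detection, here is a sketch. The function names are hypothetical, and the 128 MB overhead is simply the figure suggested above, not a tested recommendation.

```python
def default_heap_without_container_support_mib(host_memory_mib):
    """Default max heap when no container limit is detected: 1/4 of visible memory."""
    return host_memory_mib // 4


def xmx_flag_with_overhead(container_limit_mib, overhead_mib=128):
    """Build an -Xmx flag that leaves `overhead_mib` for memory outside the heap."""
    heap_mib = container_limit_mib - overhead_mib
    if heap_mib <= 0:
        raise ValueError("overhead must be smaller than the container limit")
    return f"-Xmx{heap_mib}M"


if __name__ == "__main__":
    # Example values only: a 16 GiB host and a 4 GiB container.
    print(default_heap_without_container_support_mib(16384))
    print(xmx_flag_with_overhead(4096))
```

With a 16 GiB host, the undetected default works out to a 4096 MiB heap regardless of the container size, while a 4096 MiB container with 128 MiB of overhead gives `-Xmx3968M`.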

Finally, for all the Java versions with container detection support, the default MaxRAMPercentage of 95.0 does not provide enough overhead. Because the memory value will be detected as what the container is allocated, the built-in memory overallocation logic in Wings (we assign additional memory to containers rather than the exact amount specified to help prevent issues with OOM) is included in the RAM calculation, meaning the only overhead available is 5%. A MaxRAMPercentage value of 80-90% would allow for much more overhead. The more RAM your server has assigned, the higher this value can be (within reason).
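To make the overhead figures concrete, the following sketch computes how much memory is left outside the heap for a given MaxRAMPercentage. It is plain arithmetic under assumptions: the function name is hypothetical, the input is the limit the JVM sees (it does not model the Wings overallocation), and the real JVM may round heap sizes differently.

```python
def heap_and_headroom_mib(container_limit_mib, max_ram_percentage):
    """Return (max heap, memory left outside the heap) in MiB."""
    if not 0 < max_ram_percentage <= 100:
        raise ValueError("max_ram_percentage must be in (0, 100]")
    heap_mib = int(container_limit_mib * max_ram_percentage / 100)
    return heap_mib, container_limit_mib - heap_mib


if __name__ == "__main__":
    # Example: compare a few percentages for a 4096 MiB limit.
    for pct in (95.0, 90.0, 80.0):
        heap, headroom = heap_and_headroom_mib(4096, pct)
        print(f"MaxRAMPercentage={pct}: heap {heap} MiB, headroom {headroom} MiB")
```

For a 4096 MiB limit, 95% leaves about 205 MiB outside the heap, while 80% leaves about 820 MiB.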


For most users (especially those running newer or the latest versions of Java), everything should work fine out of the box. However, tweaking of the MaxRAMPercentage flag will likely be required for many users.

matthewpi avatar Feb 24 '23 19:02 matthewpi

@matthewpi

Even with MaxRAMPercentage set to 50% to 75%, given a few days on our servers, if the OOM killer is not disabled the server still gets killed by it. Even with containers that have 20 GB to 30 GB of RAM.

Would you have any advice on why the containers are trying to use so much extra RAM? Any ideas of possible tools and/or guides that can be used to diagnose the issue? Any other suggestions on what else could be tweaked?

For a server like this one: image(2). 6 GB of overhead is a bit too much.

Edit: (I don't consider the OOM the primary issue here; it's more a question of why the hell 6 GB+ is needed for overhead.)
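One possible starting point for this kind of diagnosis, sketched under assumptions: on a cgroup v2 host, the container's memory.stat file (assumed here to be at /sys/fs/cgroup/memory.stat inside the container) splits usage into categories such as `anon` (heap and native allocations) and `file` (page cache, which counts toward the container's usage but is reclaimable). The helper names below are hypothetical.

```python
def parse_memory_stat(text):
    """Parse 'key value' lines from a cgroup v2 memory.stat file into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def summarize_mib(stats, keys=("anon", "file", "kernel_stack", "slab")):
    """Convert selected byte counters to whole MiB; missing keys count as 0."""
    return {key: stats.get(key, 0) // (1024 * 1024) for key in keys}


if __name__ == "__main__":
    # Assumed path for a cgroup v2 container; it may not exist on other setups.
    with open("/sys/fs/cgroup/memory.stat", encoding="utf-8") as handle:
        print(summarize_mib(parse_memory_stat(handle.read())))
```

If most of the gap between the heap size and the container's reported usage sits in `file`, it is cache rather than memory the JVM is holding; a large `anon` value beyond the heap would point at native allocations instead.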

KugelblitzNinja avatar Feb 26 '23 01:02 KugelblitzNinja

Hey guys, I've got a small update that may be related to this issue. I'm not sure what might be causing it, but there seems to be a limit on threads: the server reached its maximum and can't create more of them even though we have unlimited memory for the server. Hope it helps in figuring out what could be wrong! image image

petulikan1 avatar Feb 28 '23 15:02 petulikan1

@petulikan1

For that, I think you need to have a look at https://pterodactyl.io/wings/1.0/configuration.html#container-pid-limit . If it's still at container_pid_limit: 512, then you're going to want to increase it.

It is also worth confirming your host did not run out of ram.
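To check how close a container is to its PID limit, something like the sketch below could be run inside it. It assumes a cgroup v2 layout where pids.current and pids.max are readable under /sys/fs/cgroup; the function names and the 90% threshold are hypothetical. Threads count toward this limit, which is why thread creation fails when it is reached.

```python
from pathlib import Path


def read_pid_usage(cgroup_dir="/sys/fs/cgroup"):
    """Return (current task count, limit or None if unlimited) from cgroup v2 files."""
    base = Path(cgroup_dir)
    current = int((base / "pids.current").read_text().strip())
    raw_max = (base / "pids.max").read_text().strip()
    limit = None if raw_max == "max" else int(raw_max)
    return current, limit


def is_near_limit(current, limit, threshold=0.9):
    """True when usage has reached `threshold` of the limit; never for unlimited."""
    if limit is None:
        return False
    return current >= limit * threshold


if __name__ == "__main__":
    current, limit = read_pid_usage()
    print(current, limit, is_near_limit(current, limit))
```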

KugelblitzNinja avatar Feb 28 '23 17:02 KugelblitzNinja

Thanks for the link, I was trying to find something related to that but wasn't able to.

petulikan1 avatar Feb 28 '23 17:02 petulikan1

Any updates? I have the same problem

Hamziee avatar Mar 04 '23 23:03 Hamziee