
Proxy native image doesn't reclaim memory

Open sleipnir opened this issue 5 years ago • 4 comments

I ran some tests with the native image and noticed that, once memory is allocated, it never seems to be returned to the system. Evidence can be found at this link: https://github.com/sleipnir/spring-boot-cloudstate-starter/blob/master/Beanchmark.md

sleipnir avatar Apr 09 '20 18:04 sleipnir

If I've understood correctly, what the graphs show is that the native image allocates roughly 300mb from the OS and never returns it? That's expected; it's how most Java virtual machines work (the one exception is Azul Zing), and SubstrateVM, which is what gets compiled into a native image, never returns heap memory to the OS.

What these virtual machines do is, at start-up, malloc memory according to the -Xms parameter. They then typically garbage collect eden space as normal, but on each collection a small amount of memory that will eventually become garbage is promoted to the old space, and this causes the old space to grow. As it grows, it eventually fills the initially allocated space, and that's when the JVM starts mallocing more memory, until it reaches the configured -Xmx. Once it reaches that limit, it garbage collects the old space, and operation continues as normal: the JVM garbage collects whenever the old space is full, and never mallocs any more memory. But it never frees it either.
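To make that concrete, here's a minimal sketch in plain Java (it can be run on HotSpot or compiled to a native image) that prints used versus committed heap while generating short-lived garbage; watching the output, you should see totalMemory() ratchet up towards maxMemory() and, per the behavior described above, never shrink back:

```java
// Minimal sketch: observe used vs. committed heap from inside the process.
public class HeapWatch {
    public static void main(String[] args) throws InterruptedException {
        Runtime rt = Runtime.getRuntime();
        for (int i = 0; i < 60; i++) {
            long used = rt.totalMemory() - rt.freeMemory();
            System.out.printf("used=%d MiB, committed=%d MiB, max=%d MiB%n",
                    used >> 20, rt.totalMemory() >> 20, rt.maxMemory() >> 20);
            // Allocate short-lived garbage so eden collections happen and a
            // little memory gets promoted to the old space each cycle.
            byte[][] garbage = new byte[256][];
            for (int j = 0; j < garbage.length; j++) {
                garbage[j] = new byte[64 * 1024];
            }
            Thread.sleep(1000);
        }
    }
}
```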

As far as only "paying for what you use" is concerned, the level of granularity we expect this to be at is the pod level. As load increases, we'll scale up the number of pods, and you'll pay for more pods; we'll scale them back down when not in use, hopefully eventually to zero. But you'll always pay in increments of whatever the configured heap size of the proxy (and your user functions) is.

jroper avatar Apr 15 '20 00:04 jroper

Hello @jroper, I know that any VM allocates in blocks, and that this is done for various performance reasons: malloc operations are extremely expensive, can cause fragmentation, and so on. But compare the behavior of the proxy with that of the user function and you will see that the JVM scales its memory up and down orders of magnitude better. It may be just configuration, but there is definitely something going on. Regarding pay-for-what-you-use, I think you pay for any resource you use, not just the pod, but if that is the granularity adopted, then ok.

sleipnir avatar Apr 15 '20 00:04 sleipnir

Just to give you an idea, it took several hours of idle time before the memory returned to its initial level. And remember: first, I didn't say there was a problem with the Cloudstate proxy, but with SubstrateVM's behavior. Second, the memory metrics for SubstrateVM and the JVM were the same kind, that is, I was not talking only about heap, but about RSS. I'm sure that, as you said yourself, no performance tuning has been done yet, and much will be improved in the future for the native image.
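For anyone who wants to reproduce the RSS side of this, a minimal sketch (Linux-only, since it reads /proc/self/status; VmRSS covers the whole process, not just the Java heap):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

public class RssWatch {
    public static void main(String[] args) throws IOException {
        // VmRSS in /proc/self/status is the resident set size of the whole
        // process: everything the OS has mapped into physical memory. This
        // is the process-level figure, as opposed to heap-only metrics.
        List<String> lines = Files.readAllLines(Paths.get("/proc/self/status"));
        for (String line : lines) {
            if (line.startsWith("VmRSS")) {
                System.out.println(line.trim());
            }
        }
    }
}
```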

sleipnir avatar Apr 15 '20 00:04 sleipnir

This seems to be related to the following GraalVM issue. The link has some values that can be tuned for libgraal; maybe you can play with them and see if they help ;)

https://github.com/oracle/graal/issues/2224

Maybe I'll make time to play with that too.

It seems to me that the default sizes for the young and old generations are quite high, which means the GC won't even start collecting until a certain amount of memory has been used, as you have already noted. Adjusting the values mentioned in the link should solve the problem.
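For reference, a native image accepts the standard heap sizing options at run time, so these limits can be set when launching the binary. A sketch, with an illustrative binary name (-Xms, -Xmx and -Xmn, which sizes the young generation, are documented SubstrateVM runtime flags):

```java
// Hypothetical launch of the native-image proxy with explicit heap sizing
// (the binary name is made up; the flags are SubstrateVM runtime options):
//
//   ./cloudstate-proxy-native -Xms64m -Xmx256m -Xmn32m
//
// From inside the process, the cap actually in effect can be verified:
public class EffectiveHeap {
    public static void main(String[] args) {
        System.out.printf("effective max heap: %d MiB%n",
                Runtime.getRuntime().maxMemory() >> 20);
    }
}
```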

sleipnir avatar Apr 15 '20 01:04 sleipnir