
High memory usage when loading a large number of animated sprites

Open peardox opened this issue 3 years ago • 10 comments

As noted in Discord...

Loading 4 Texture atlases, each image = 3840 x 4000 (25x24 frames @ 160x160)
A total of 680 animations of 1 to 8 frames each (the vast majority are 3 frame animations)
Total frames used = 2064, 336 = empty (unreferenced)
Idle memory usage (nothing loaded) = 40M
Test memory usage (680 sprites animating on screen) = 3114M
Windows App Frame rate - 2.65, Render Only = 25 (Mac 2.65 / 36)
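As a quick sanity check of the figures above, here is a small sketch that recomputes the frame accounting and a rough per-sprite memory cost. It assumes each atlas is tiled edge to edge with 160x160 frames, and that the whole difference between idle and test memory is attributable to the 680 sprites; the variable names are illustrative only.

```python
# Figures taken from the report above.
ATLAS_W, ATLAS_H = 3840, 4000
FRAME = 160
ATLASES = 4
FRAMES_USED = 2064
SPRITES = 680
IDLE_MB, TEST_MB = 40, 3114

# Assumption: frames tile each atlas exactly (24 columns x 25 rows).
frames_per_atlas = (ATLAS_W // FRAME) * (ATLAS_H // FRAME)
total_frames = frames_per_atlas * ATLASES
empty_frames = total_frames - FRAMES_USED

# Assumption: all memory above idle is spent on the sprites.
mb_per_sprite = (TEST_MB - IDLE_MB) / SPRITES

print(f"frames per atlas: {frames_per_atlas}")
print(f"total frames:     {total_frames}")
print(f"empty frames:     {empty_frames}")
print(f"MB per sprite:    {mb_per_sprite:.2f}")
```

The empty-frame count agrees with the 336 reported, and the average of roughly 4.5 MB per small animated sprite is what makes the total look suspicious.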

Testcase - https://github.com/peardox/MultiSprite

Some useful variables...

gScale := 0.3375; // MainGameUnit.pas - 299 - initial sprite size, the default scale is to fit all sprites on a 1920x1080 fullscreen display
gLimit := 680; // MainGameUnit.pas - 302 - How many sprites to create - 680 is "all of them", lower this number to check mem usage for less sprites

I've left a few simple key-bindings in as they're useful for testing as follows

1 - Force all sprites to original size
[UpArrow] - Increase size of all sprites by 10%
[DownArrow] - Decrease size of all sprites by 10%
[ESC] - Quit
S - Screengrab (to data/screengrab.png)

peardox avatar Apr 27 '21 08:04 peardox

Is this fixed by now?

CodingMadness avatar Sep 29 '21 09:09 CodingMadness

Nope, it's on a todo list. My latest version of "lots of things at once" runs out of memory even better :) https://github.com/peardox/FibonacciSphere2 - got an OOM with 4k objects on a 32 GB PC

peardox avatar Sep 29 '21 09:09 peardox

I started playing with this, profiling memory using massif ( https://github.com/castle-engine/castle-engine/wiki/Profiling-Using-Valgrind ).

Did one optimization in https://github.com/castle-engine/castle-engine/commit/de92db8f047fdb535c362fc884688ea0cf082dcb , but it's small (70 MB in my tests -- on Linux x86_64).

Tested that compiling with CASTLE_SLIM_NODES defined also helps, by 500 MB.

These gains are still too small. I will keep investigating and optimizing.

michaliskambi avatar Oct 13 '21 21:10 michaliskambi

I am 100% sure that the main problem is not in the textures. Textures are correctly cached, only 4 are loaded (to RAM and then to GPU). The 4 large textures in this testcase (3840x4000) take about 230 MB in memory and on GPU. This was confirmed by TextureMemoryProfiler in CGE and by a test with the textures resized to 2x2: RAM usage drops, as it should, by ~230 MB.
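The ~230 MB figure can be reproduced with simple arithmetic. The sketch below assumes uncompressed 8-bit RGBA (4 bytes per pixel) and no mipmaps; neither assumption is stated in the thread, so treat this as a plausibility check rather than a description of what CGE actually allocates.

```python
WIDTH, HEIGHT = 3840, 4000
BYTES_PER_PIXEL = 4  # assumption: uncompressed 8-bit RGBA, no mipmaps
TEXTURES = 4

bytes_per_texture = WIDTH * HEIGHT * BYTES_PER_PIXEL
total_bytes = bytes_per_texture * TEXTURES
total_mib = total_bytes / (1024 * 1024)

print(f"per texture: {bytes_per_texture} bytes")
print(f"all four:    {total_bytes} bytes = {total_mib:.1f} MiB")
```

Under these assumptions the four atlases come to about 234 MiB, which is consistent with the measured ~230 MB and small next to the roughly 3 GB total.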

This could be decreased significantly by using GPU texture compression ( https://castle-engine.io/creating_data_auto_generated_textures.php ). Anyway it doesn't matter for this testcase...

... because we eat too much memory even when all the 4 textures are replaced with dummy 2x2 images. So I'll continue researching with dummy 2x2 images, as they reproduce the problem too.

This is good news: it means the culprit is in the data structures we manage, and the structures for sprite sheets should be trivial.

michaliskambi avatar Oct 13 '21 21:10 michaliskambi

It should be noted that with https://github.com/peardox/FibonacciSphere2 (same concept using 3D) I actually managed to use all 28 GB of system memory (on the Russian cloud PC) going from 2k -> 4k objects. I'll try this on the new laptop as well (RTX 3060 GPU) in a mo...

peardox avatar Oct 13 '21 21:10 peardox

The culprit is that X3D nodes and their fields, in their current implementation, just take way too much memory.

And this testcase iterates over all animations, and for each animation clones the entire X3D graph (that contains all the animations). So you get ~n^2 memory usage, and you have models with many animations (so you indeed make n large).

FSpriteSheet.AnimationCount 168
FSpriteSheet.AnimationCount 232
FSpriteSheet.AnimationCount 200
FSpriteSheet.AnimationCount 80
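To put a number on the ~n^2 growth, here is a rough sketch using the four animation counts logged above. It assumes that each sprite sheet is cloned once per animation and that every clone carries all of that sheet's animations, as described; it counts animation copies only and says nothing about bytes.

```python
# AnimationCount values logged for the four sprite sheets.
animation_counts = [168, 232, 200, 80]

# Animations actually needed: one per sprite on screen.
needed = sum(animation_counts)

# Animations held in memory when each of the n sprites from a sheet
# clones the whole graph containing all n animations: n * n per sheet.
cloned = sum(n * n for n in animation_counts)

print(f"animations needed: {needed}")
print(f"animations cloned: {cloned}")
print(f"overhead factor:   {cloned / needed:.1f}x")
```

The counts sum to the 680 sprites in the testcase, while the clones hold 128448 animation copies between them, close to 189 times more than are ever played.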

It is possible to prevent such large consumption in this particular app (remove unnecessary animations after cloning), but this is not the answer to the general problem in CGE. The testcase shows a real, simple use-case of sprite sheets (with maaaaany animations) on a map. This should be optimized better in CGE.

And it's not the first time I have observed that our X3D fields memory usage is too high. I have a plan for how to optimize them a lot, though it will take some work.

I'm going to do it :), it will take a few steps to really finish the optimization.

Things done now:

  • more efficient "step" animation (used by sprite sheets), so that interpolation data eats 2x less memory
  • simplified X3D nodes inheritance using TNodeFunctionality, removed interfaces (this also makes Delphi compatibility easier)

More will follow.


Note: you have a memory leak in this application, you never free Stage created by

Stage := TCastleScene.Create(nil);

Simply changing owner to Application is enough.

This is not the culprit of this issue -- I mean you're not leaking memory at runtime (forgetting to free something that should be freed earlier than application end), and the memory usage at runtime is still unacceptably large.

michaliskambi avatar Oct 18 '21 23:10 michaliskambi

And one more important optimization to sprite sheets nodes pushed.

The rest will be optimizing the X3D fields.

michaliskambi avatar Oct 18 '21 23:10 michaliskambi

Just did some tests of old vs new on Windows
New code : Memory = 1950M, Load = 3.750s
Old code : Memory = 2885M, Load = 5.450s
i.e. it's loading about 45% faster as an added benefit
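For reference, the percentages behind these measurements can be worked out as below. This is only arithmetic on the four numbers quoted; the "about 45% faster" figure matches the ratio of load times (old / new), which is a different quantity from the reduction in load time.

```python
old_mem_mb, new_mem_mb = 2885, 1950
old_load_s, new_load_s = 5.450, 3.750

# Share of memory saved by the new code.
mem_saved_pct = (old_mem_mb - new_mem_mb) / old_mem_mb * 100

# Share of load time saved (time-based view).
time_saved_pct = (old_load_s - new_load_s) / old_load_s * 100

# Speed-up (rate-based view): how much faster loading proceeds.
speedup_pct = (old_load_s / new_load_s - 1) * 100

print(f"memory saved:    {mem_saved_pct:.1f}%")
print(f"load time saved: {time_saved_pct:.1f}%")
print(f"loading faster:  {speedup_pct:.1f}%")
```

So the new code uses roughly a third less memory and a third less load time, which is the same thing as loading about 45% faster.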

Gonna try this on FibonacciSphere2 as that takes 12 mins + to load with a lot of objects (4k)

peardox avatar Oct 19 '21 01:10 peardox

Hmm - loading 3D sees a small drop in memory usage, but doubling N (N * 2) still results in 4x the scene.load (N * 4). I imagine this is linked to the same stuff as Sprites tho...

peardox avatar Oct 19 '21 02:10 peardox

Not a fix yet, but at least I can start employing optimizations thanks to auto-generated nodes code: https://github.com/castle-engine/castle-engine/commit/1c4319d64d4b4899d2db034e1627ac1763b95d3c

michaliskambi avatar Aug 22 '22 01:08 michaliskambi