
Doesn't memory keep growing?

Open plasticbox opened this issue 1 year ago • 5 comments

Describe the bug
https://github.com/EmbarkStudios/puffin/pull/169

I get that it's a PR for optimisation, but doesn't this result in an unbounded number of ScopeCollections and thus an ever-increasing memory footprint? In fact, if you run the current puffin_http example server, the memory keeps growing over time.

I'm just checking because after applying the release with this update, the memory exploded.

To Reproduce
cd puffin_http
cargo run --example server

Expected behavior
Memory should not continue to grow.

plasticbox avatar Jan 25 '24 12:01 plasticbox

Thanks for the report. I can confirm there is a leak in the server example (or in puffin itself). There might be a bug, in which case we'll make it a priority to fix. I'll have a look right now.

TimonPost avatar Jan 25 '24 13:01 TimonPost

Alright, I suspect this is not a leak. Instead, it is the 'max recent' frames being stored in FrameView. By default it stores 18000 records, but this value can be configured to be lower. There seem to be no leaks regarding scope collections: those are hashmaps and we do not keep creating new keys. Once it hits that cap, memory stops growing. When I lower the cap, I don't see my memory growing after it is reached.
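
Conceptually, the cap works like this (a minimal sketch of the idea only, not puffin's actual code; FrameData here is just a stand-in for puffin's per-frame record):

use std::collections::VecDeque;

struct FrameData; // stand-in for puffin's per-frame data

struct RecentFrames {
    max_recent: usize,           // FrameView defaults to 18000
    frames: VecDeque<FrameData>,
}

impl RecentFrames {
    fn push(&mut self, frame: FrameData) {
        self.frames.push_back(frame);
        // Once the cap is reached, the oldest frames are dropped,
        // so memory stops growing after roughly max_recent frames.
        while self.frames.len() > self.max_recent {
            self.frames.pop_front();
        }
    }
}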

TimonPost avatar Jan 25 '24 14:01 TimonPost

Thanks for the quick check.

https://github.com/plasticbox/puffin/commit/2f30823448b9d912c6c75a13822ab28f9c8e8ce5 This is how I tested it.

As an extreme, I tried setting set_frame_view_max_recent to 1. Despite this, the memory still seems to be increasing.

Is there anything else I can check?

I'm using bevy, with puffin::profile_function!() in each system function across about 4 threads, and calling new_frame once per frame for the entire set of bevy systems. FPS is in the tens to hundreds.
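
For reference, that setup looks roughly like this (a sketch; the function names are made up, but puffin::profile_function! and GlobalProfiler::lock().new_frame() are the real API):

fn my_bevy_system(/* queries, resources, ... */) {
    puffin::profile_function!(); // one scope per system; systems run across ~4 threads
    // ... system logic ...
}

// Called once at the end of each frame, after all systems have run:
fn finish_frame() {
    puffin::GlobalProfiler::lock().new_frame(); // closes out the scopes recorded for this frame
}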

plasticbox avatar Jan 26 '24 05:01 plasticbox

You can use re_memory to profile the memory use.

cargo add re_memory

use re_memory::AccountingAllocator;

#[global_allocator]
static GLOBAL: AccountingAllocator<std::alloc::System> = AccountingAllocator::new(std::alloc::System);

re_memory::accounting_allocator::set_tracking_callstacks(true);

let it run for a while and then call

re_memory::accounting_allocator::tracking_stats()

to get the statistics. The memory leak should show up in

https://docs.rs/re_memory/0.12.1/re_memory/accounting_allocator/struct.TrackingStatistics.html#structfield.top_callstacks
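
Putting those pieces together, a minimal sketch could look like this (the TrackingStatistics and CallstackStatistics field names follow the re_memory 0.12 docs linked above; double-check them against the version you install):

use re_memory::AccountingAllocator;

#[global_allocator]
static GLOBAL: AccountingAllocator<std::alloc::System> =
    AccountingAllocator::new(std::alloc::System);

fn main() {
    // Enable callstack tracking before the allocations you want attributed.
    re_memory::accounting_allocator::set_tracking_callstacks(true);

    // ... run the server / profiled workload for a while ...

    if let Some(stats) = re_memory::accounting_allocator::tracking_stats() {
        println!(
            "fully_tracked: {} allocations, {} bytes",
            stats.fully_tracked.count, stats.fully_tracked.size
        );
        // A leak shows up as a callstack whose live ("extant") size keeps growing.
        for callstack in stats.top_callstacks.iter().take(5) {
            println!(
                "{} bytes in {} live allocations at:\n{}",
                callstack.extant.size, callstack.extant.count, callstack.readable_backtrace
            );
        }
    }
}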

emilk avatar Jan 26 '24 08:01 emilk

Running the server with the frame view's recent frames limited, I get:

[21sec]: track_size_threshold: 128, untracked:  { count: 49, size: 1980 }, stochastically_tracked:  { count: 111, size: 20726 }, fully_tracked:  { count: 98, size: 1830976 }, overhead:  { count: 26, size: 52256 }
[42sec]: track_size_threshold: 128, untracked:  { count: 48, size: 1972 }, stochastically_tracked:  { count: 113, size: 21030 }, fully_tracked:  { count: 100, size: 1871218 }, overhead:  { count: 27, size: 55632 }
[63sec]: track_size_threshold: 128, untracked:  { count: 48, size: 1972 }, stochastically_tracked:  { count: 113, size: 21030 }, fully_tracked:  { count: 100, size: 1857166 }, overhead:  { count: 27, size: 55632 }
[84sec]: track_size_threshold: 128, untracked:  { count: 48, size: 1972 }, stochastically_tracked:  { count: 114, size: 21142 }, fully_tracked:  { count: 101, size: 1890342 }, overhead:  { count: 27, size: 55632 }
[105sec]: track_size_threshold: 128, untracked:  { count: 48, size: 1972 }, stochastically_tracked:  { count: 115, size: 21334 }, fully_tracked:  { count: 102, size: 1911947 }, overhead:  { count: 27, size: 55632 }
[126sec]: track_size_threshold: 128, untracked:  { count: 48, size: 1972 }, stochastically_tracked:  { count: 114, size: 21182 }, fully_tracked:  { count: 101, size: 1869374 }, overhead:  { count: 27, size: 55632 }
[147sec]: track_size_threshold: 128, untracked:  { count: 48, size: 1972 }, stochastically_tracked:  { count: 115, size: 21334 }, fully_tracked:  { count: 102, size: 1907087 }, overhead:  { count: 27, size: 55632 }
[168sec]: track_size_threshold: 128, untracked:  { count: 48, size: 1972 }, stochastically_tracked:  { count: 114, size: 21182 }, fully_tracked:  { count: 101, size: 1885909 }, overhead:  { count: 27, size: 55632 }
[189sec]: track_size_threshold: 128, untracked:  { count: 48, size: 1972 }, stochastically_tracked:  { count: 114, size: 21182 }, fully_tracked:  { count: 101, size: 1880389 }, overhead:  { count: 27, size: 55632 }
[210sec]: track_size_threshold: 128, untracked:  { count: 48, size: 1972 }, stochastically_tracked:  { count: 115, size: 21334 }, fully_tracked:  { count: 102, size: 1906796 }, overhead:  { count: 27, size: 55632 }

And after a couple hours:

[10510sec]: track_size_threshold: 128, untracked: CountAndSize { count: 34, size: 1275 }, stochastically_tracked: CountAndSize { count: 114, size: 22062 }, fully_tracked: CountAndSize { count: 101, size: 1926586 }, overhead: CountAndSize { count: 35, size: 81784 }

This looks fine; the numbers are quite stable after 200 secs. I do notice it increases, as expected, when not limiting the recent frames. I think we should make this configurable like you did in your commit, because we do not allow it at the moment.

Also, I ran it for over 30 minutes and did not notice any increase in memory. Are you sure this is not a bevy problem? What version are you using, did they do the migration right, and do they have custom logic for custom scopes?

TimonPost avatar Jan 26 '24 10:01 TimonPost