sysinfo
sysinfo copied to clipboard
Possible memory leak in `Disks::refresh_list`
Describe the bug
System: macOS Sonoma 14.1.2 (23B92)
sysinfo version: 0.30.12
When calling Disks::refresh_list() repeatedly in a loop, the memory the application uses grows without bound. Variations shown in the minimum reproducible example below:
To Reproduce
use std::thread::sleep;
use sysinfo::{Disks, MINIMUM_CPU_UPDATE_INTERVAL};
fn main() {
// Memory leak when using `new_with_refreshed_list`:
// loop {
// let _ = Disks::new_with_refreshed_list();
// sleep(MINIMUM_CPU_UPDATE_INTERVAL);
// }
// Memory leak when using `refresh_list`:
// let mut disks = Disks::new_with_refreshed_list();
// loop {
// disks.refresh_list();
// disks.refresh();
// sleep(MINIMUM_CPU_UPDATE_INTERVAL);
// }
// No memorty leak but `available_space` isn't updated when the space on my computer is
// actively changing:
let mut disks = Disks::new_with_refreshed_list();
loop {
disks.refresh();
let first_available_space = disks.first().map(|disk| disk.available_space()).unwrap();
println!("Available space: {first_available_space}");
sleep(MINIMUM_CPU_UPDATE_INTERVAL);
}
}
Using cargo instruments I ran an Allocations template for my program and was able to capture the following memory allocation trace:
That queryCache just grows over time
It means that CFURLCopyResourcePropertiesForKeys is allocating memory that is never freed. Sounds pretty bad.
Is it on mac M* or an older one? If it's on mac M*, I can't test it as I don't have one.
I'm experiencing it on an M*, but I think others have experienced it on Intel. Let me try find an Intel to run the example above on and see if it persists.
In the meantime, I downgraded to 0.29 and the issue persists on that version too.
I've been able to confirm that this behaviour does still occur on an Intel based Mac
Then I'll be able to check what's going on locally. Thanks for the information!
I just checked and... I have absolutely no idea where the leak comes from... To be honest, I'm pretty bad at using mac tools to try to check memory issues since they don't have valgrind and their other tools are confusing. So for now, I'll consider it's "ok". Don't hesitate to give it a try though, help to fix this issue would be very welcome!
Ok, if you leave the issue open, I will try circle back to it and see what I can find. Do you know of some way that I can work around the issue for now? Is there some way that I can instantiate the library in a way that forces it to clear the memory?
It seems to be specific to the running process (or maybe the current thread? To be confirmed). So unless you get this information from another process/thread, I don't see how.
Good suggestion, spawning a new thread does the trick. The underlying code still needs to be fixed, but the code below prevents the memory from growing past one invocation's worth:
use std::thread::sleep;
use sysinfo::{Disks, MINIMUM_CPU_UPDATE_INTERVAL};
fn main() {
// Memory leak when using `refresh_list`:
loop {
let handle = std::thread::spawn(move || {
let mut disks = Disks::new_with_refreshed_list();
disks.refresh_list();
disks.refresh();
disks.first().map(|disk| {
let available_space = disk.available_space();
println!("{available_space}");
});
sleep(MINIMUM_CPU_UPDATE_INTERVAL);
});
let _ = handle.join();
}
}
It'd be much simpler if we could find out where the leaked memory comes from. If anyone knows a good memory debugger for mac...
In this comment, I used cargo instruments (cargo install cargo-instruments) which uses the standard macOS tools, running the minimum example with cargo instruments -t Allocations --time-limit 10000.
In the Instruments application, on the left side, you can change from Statistics to Call Trees, which is much more helpful.
to
Then, you can traverse the tree and see where the memory is being allocated from. In the comment above, you can see in my case the lowest level user-space call was sysinfo::unix::apple::disk::get_disk_properties.
I imagine the solution would involve calling release on one of the underlying framework entities, maybe the result stored from CFURLCopyResourcePropertiesForKeys or something like that.
Thanks for the detailed explanations! I'll give it another try when I have time.
System: macOS Sonoma 14.5 sysinfo version: 0.30.12
It seems no problem.
If I run the code below for a long time, I notice after the memory grows to 50M, but will drop back to about 10M, and then increase and drop again.
loop {
let _ = Disks::new_with_refreshed_list();
sleep(MINIMUM_CPU_UPDATE_INTERVAL);
}
Hi everyone. I'm the author of the code in question here. During its original development I spent a lot of time making sure there were no CoreFoundation-induced leaks, so I took a look at this one today to triple check. And: I've concluded its not a memory leak.
I actually had to deal with a very similar problem to this in my crate arboard. In that case, the OS frameworks were caching resources as a result of Objective-C calls. The same thing is happening here (as can be seen from the multiple caching references in the Instruments stack calls. Note the Objective-C calls in the stacktrace are the ones with [] around them.
The root cause problem here is that Objective-C implicitly autoreleases a lot, and often caches will take advantage of this for automatic cleanup. The part that tipped me off here was this example spawning threads to remove the leak. On macOS every thread has its own autorelease pool stack, which its responsible for draining. In "normal" macOS apps, the main thread event loop does this for you implicitly. In Rust... it does not.
This means that if you have a single thread calling the same function extremely fast with autoreleasing objects somewhere in the callstack, you get what appears to be a memory leak. But, once the main thread exits (or each temporary thread in a loop), the thread's pools are cleaned up automatically by the OS. This is also why the leaks CLI doesn't report any leaks on the original reproducer: by the time main returns, the memory has been freed.
To demonstrate this, you can use one of the original reproducers and wrap it in a new autorelease pool per loop, which will be drained when the scope ends (at the end of each loop iteration):
let mut disks = Disks::new_with_refreshed_list();
loop {
objc2::rc::autoreleasepool(|_pool| {
disks.refresh_list();
disks.refresh();
std::thread::sleep(sysinfo::MINIMUM_CPU_UPDATE_INTERVAL);
})
}
See the with/without autorelease pool for allocation traces:
With the explanation done, how should this be fixed? IMO calling refresh disks fast enough to incur this is pathological behavior and not something everyone will do, so adding an autorelease pool to sysinfo is the wrong direction. This is because the autorelease pool is unnecessary for other uses cases like if its compiled into a proper Cocoa app or used at a slower pace. The same was true in arboard, except that was actually using Objective-C and has a tighter contract to uphold because it dealt with many objects autoretained from AppKit. Given all that, I would lean towards saying it's the job of more "specialized" applications to handle this on their own and add autorelease pools if the temporary memory spikes are non-tolerable. Of course, this is up to @GuillaumeGomez as the crate maintainer.
Thanks for the very detailed explanations!
So if I understand correctly, getting disk information in a thread would allow for the extra resources to be cleared once the thread is removed, right? That would be an acceptable solution.
Otherwise, not doing anything would be acceptable too.
Yes, that is correct 👍. At the cost of some unsafe code (or an objc2 dependency), a manual autorelease pool would incur less overhead then a full thread spawn in sysinfo, but someone would need to benchmark to see if it makes a noticeable difference.
I'd rather just use some unsafe code directly. To benchmark it, you can add a new benchmark for it in benches to make this function in particular called and then check a before/after. We can think about it once we have the results. Since I can't trigger the memory bug on my computer, not sure it'd be worth it to check it here. Wanna give it a try?
Thanks @complexspaces for the detailed response!
Just to add a response to this portion of the comment:
With the explanation done, how should this be fixed? IMO calling refresh disks fast enough to incur this is pathological behavior and not something everyone will do, so adding an autorelease pool to sysinfo is the wrong direction. This is because the autorelease pool is unnecessary for other uses cases like if its compiled into a proper Cocoa app or used at a slower pace. The same was true in arboard, except that was actually using Objective-C and has a tighter contract to uphold because it dealt with many objects autoretained from AppKit. Given all that, I would lean towards saying it's the job of more "specialized" applications to handle this on their own and add autorelease pools if the temporary memory spikes are non-tolerable. Of course, this is up to @GuillaumeGomez as the crate maintainer.
I only used a tight loop to exacerbate the issue, it would normally only be called every 15 minutes in our application and the application would crash after a couple days. I don't think this use case for the library is too foreign, for example, a toolbar application that has a thread to check disk usage seems like it would be a fairly common use case.
Either way, we will pursue the objc2::rc:: autoreleasepool approach in our codebase to mitigate the issue for now. Thank you for highlighting that solution, I wasn't aware of the method and that it could be called in this way outside of the underlying objective-C code :)
The fact that the application crashes after a few days is not acceptable for me. Rust now provides thread::scope which would allow to have a thread to get the data and then it would free the memory. If it sounds fine for both of you, I can implement it and make a release in the next hours.
I agree, a crash every few days is more problematic. Please have a look at https://github.com/GuillaumeGomez/sysinfo/pull/1344, which implements the autorelease pool solution I proposed earlier.
I would appreciate a smoke test of that branch if anyone has time too.
Published the new version with this fix.
Thanks for the work on this! I'll put it through testing on our side 🙂