godot-cpp
godot-cpp copied to clipboard
Accessing Dictionary from inside of Array results in a memory leak
Godot version
4.1.1
godot-cpp version
4.1
System information
Fedora Silverblue 38
Issue description
This function results in a memory leak.
Recreating this function line by line in GDScript doesn't result in memory leak.
I've tried to return Dictionary as is:
Which works fine.
Callable and callv()
are affected too.
Steps to reproduce
N/A
Minimal reproduction project
gdextension-memory-leak-main.zip
You can also view it here: https://github.com/Lasuch69/gdextension-memory-leak
I'd like to add that in this function, leak is also present. It seems like, accessing
Dictionary
from Array
or TypedArray
in any way leaks memory.
Thanks for the report!
Can you explain how you're detecting the memory leak?
Memory graph rises indefinitely. I can check it with Valgrind like I did originaly.
Edit:
Valgrind confirms my theory.
I've also hit this memory leak accessing an array of arrays. I narrowed it down by commenting and retrying >,<
/* Check if test input conflicts with other entries within a given column of the playfield */
bool GridCalc::valid_in_grid_col(const Array &grid, int col, int row, int row_count, int test)
{
// std::vector<int> grid_col = (std::vector<int>)grid[col];
const Array &grid_col = (const Array &)grid[col];
godot::UtilityFunctions::print(vformat("GridCalc::valid_in_grid_col Checking grid_col vector: %d", (int)grid_col[0]));
for (int check = 0; check < row_count; check++)
{
if (check == row)
{
continue;
}
// if ((int)grid_col[check] == test)
// {
// return false;
// }
}
return true;
}
grid
is a 2d array of ints (0-9) and my function is one of several that checks if a test value has any conflicts in the grid along the column. I've gone through multiple iterations of trying to cast the sub-array as an Array or a Vector and the memory graph continually climbs when the code is invoked, until the system runs out of memory and crashes.
Based on my testing (left the commented code in for reference), all it takes is initializing the grid_col
variable to trigger the leak.
Here is the end of my valgrind log (full log is >100mb, I can pastebin it if desired):
==98178== 25,468,800 bytes in 318,360 blocks are definitely lost in loss record 2,588 of 2,589
==98178== at 0x86050C5: malloc (vg_replace_malloc.c:442)
==98178== by 0x4E316E2: ??? (in /opt/godot/releases/Godot_v4.1.3-stable_linux.x86_64)
==98178== by 0x4758CD1: ??? (in /opt/godot/releases/Godot_v4.1.3-stable_linux.x86_64)
==98178== by 0xF419C15: godot::Variant::operator godot::Array() const (in ~/git/godot-sudoku/GDExtension/bin/libgridcalc.linux.template_debug.x86_64.so)
==98178== by 0xF40252C: godot::GridCalc::valid_in_grid_table(godot::Array const&, int, int, int, int, int) (in ~/git/godot-sudoku/GDExtension/bin/libgridcalc.linux.template_debug.x86_64.so)
==98178== by 0xF403209: godot::MethodBindTR<bool, godot::Array const&, int, int, int, int, int, int, int>::ptrcall(void*, void const* const*, void*) const (in ~/git/godot-sudoku/GDExtension/bin/libgridcalc.linux.template_debug.x86_64.so)
==98178== by 0x129A8FD: ??? (in /opt/godot/releases/Godot_v4.1.3-stable_linux.x86_64)
==98178== by 0x118EB27: ??? (in /opt/godot/releases/Godot_v4.1.3-stable_linux.x86_64)
==98178== by 0x47DEFBA: ??? (in /opt/godot/releases/Godot_v4.1.3-stable_linux.x86_64)
==98178== by 0x4640C79: ??? (in /opt/godot/releases/Godot_v4.1.3-stable_linux.x86_64)
==98178== by 0x128B573: ??? (in /opt/godot/releases/Godot_v4.1.3-stable_linux.x86_64)
==98178== by 0x118EB27: ??? (in /opt/godot/releases/Godot_v4.1.3-stable_linux.x86_64)
==98178==
==98178== 53,375,200 bytes in 667,190 blocks are definitely lost in loss record 2,589 of 2,589
==98178== at 0x86050C5: malloc (vg_replace_malloc.c:442)
==98178== by 0x4E316E2: ??? (in /opt/godot/releases/Godot_v4.1.3-stable_linux.x86_64)
==98178== by 0x4758CD1: ??? (in /opt/godot/releases/Godot_v4.1.3-stable_linux.x86_64)
==98178== by 0xF419C15: godot::Variant::operator godot::Array() const (in ~/git/godot-sudoku/GDExtension/bin/libgridcalc.linux.template_debug.x86_64.so)
==98178== by 0xF402419: godot::GridCalc::valid_in_grid_row(godot::Array const&, int, int, int, int) (in ~/git/godot-sudoku/GDExtension/bin/libgridcalc.linux.template_debug.x86_64.so)
==98178== by 0xF4026EC: godot::GridCalc::valid_in_grid(godot::Array const&, int, int, int, int, int, int, int) (in ~/git/godot-sudoku/GDExtension/bin/libgridcalc.linux.template_debug.x86_64.so)
==98178== by 0xF403209: godot::MethodBindTR<bool, godot::Array const&, int, int, int, int, int, int, int>::ptrcall(void*, void const* const*, void*) const (in ~/git/godot-sudoku/GDExtension/bin/libgridcalc.linux.template_debug.x86_64.so)
==98178== by 0x129A8FD: ??? (in /opt/godot/releases/Godot_v4.1.3-stable_linux.x86_64)
==98178== by 0x118EB27: ??? (in /opt/godot/releases/Godot_v4.1.3-stable_linux.x86_64)
==98178== by 0x47DEFBA: ??? (in /opt/godot/releases/Godot_v4.1.3-stable_linux.x86_64)
==98178== by 0x4640C79: ??? (in /opt/godot/releases/Godot_v4.1.3-stable_linux.x86_64)
==98178== by 0x128B573: ??? (in /opt/godot/releases/Godot_v4.1.3-stable_linux.x86_64)
==98178==
==98178== LEAK SUMMARY:
==98178== definitely lost: 103,695,444 bytes in 1,296,367 blocks
==98178== indirectly lost: 1,780 bytes in 8 blocks
==98178== possibly lost: 171,184 bytes in 827 blocks
==98178== still reachable: 670,687 bytes in 3,406 blocks
==98178== suppressed: 0 bytes in 0 blocks
==98178==
==98178== For lists of detected and suppressed errors, rerun with: -s
==98178== ERROR SUMMARY: 176 errors from 176 contexts (suppressed: 0 from 0)
* NB: I ran valgrind with 4.1.3, but 4.2 hasn't alleviated the issue.
I tried the MRP in the original description, and while Valgrind isn't reporting anything useful to me, I do see the memory rising constantly when this code is included:
# Leaks
var data: Dictionary = get_data()
... but the memory remains stable if I comment that out.
So, I'm seeing the leak too!
System Info:
- Godot v4.2.1.stable - Windows 10.0.22621 - Vulkan (Forward+) - dedicated NVIDIA GeForce GTX 980 Ti (NVIDIA; 31.0.15.3598) - 13th Gen Intel(R) Core(TM) i7-13700K (24 Threads)
- GDExtension built with MSVC using godot-cpp 4.2.1 branch.
I have experienced a similar issue that I believe shares the same root cause. From looking at the allocation call stack of my example and the original example, it seems that the leak is caused by casting from Variant
to Dictionary
, Array
, or TypedArray
(and possibly other classes), but only in GDExtension.
The following code leaks, but only in C++ GDExtension. If I compile this code as a part of the main Godot source, it does not leak:
Array original_array;
for (int i = 0; i < 1000; i++) {
Variant my_variant = original_array;
Array my_array = my_variant; // This line causes memory to leak
}
The leak also happens if I used Dictionary
or TypedArray
instead of Array
. I am able to see the leak in both the Godot editor as well as in Visual Studio when debugging my GDExtension:
Visual Studio Allocation Call Stack
Here is the Visual Studio Allocation Call Stack (as seen using a local MSVC build of Godot 4.2.1):
Comparison to Core Engine
I see that GDExtension uses this sort of cast operator in variant.cpp
of godot-cpp:
Variant::operator Array() const {
Array result;
to_type_constructor[ARRAY](result._native_ptr(), _native_ptr());
return result;
}
...which is introduces additional code paths compared to the variant.cpp
of the core engine:
Variant::operator Array() const {
if (type == ARRAY) {
return *reinterpret_cast<const Array *>(_data._mem);
} else {
return _convert_array_from_variant<Array>(*this);
}
}
My example of the leak does not trigger in the core engine because the Variant
is correctly checked to see if it's already an Array
, and because it is one, it simply does the return *reinterpret_cast<const Array *>(_data._mem);
and everything's a-OK.
...But in GDExtension, although it seems like it does this check, it still eventually ends up at _p = memnew(ArrayPrivate);
in Array::Array()
to make a whole new Array
instead of just referencing the old one. I'm still not sure why the reference counting fails and the leak happens, though...
It's possible this leak has some similarities to a chain of leaks from way back when ( https://github.com/godotengine/godot-cpp/pull/490 https://github.com/godotengine/godot-cpp/pull/356 https://github.com/godotengine/godot-cpp/pull/355). Is it possible some of these sorts of issues were reintroduced with e4ed48976a962b67e9585cc2d20d11f115ef7949 or some other major change? (I'm just trying to do some git log detective work here 🕵️)
~~I don't have a good grasp of C++ rvalues yet, so I'm not sure I can debug this any further on my own...~~
I think this is a good time for me to learn about rvalues, Variant
casting, Array
ref counting, and how the GDExtension ABI works, so give me a week or two and I feel like I should be able to at least figure out exactly what's going on.
Leak Details
Here's a very boiled down version of the leak:
Array original_array;
Variant my_variant = original_array;
Array my_array = my_variant; // This line causes memory to leak
The cast that happens during that assignment creates an additional Array
object that has its refcount
go up to 2 and then back down to 1, but it never reaches 0. Here's a part of why that happens:
- GDExtension's
Variant::operator Array() const
is called - GDExtension's
Array::Array()
is called, which creates a couple of new temporaryArray
: -
Array::Array(const Array &p_from)
is called, which allocates a newArrayPrivate
. Therefcount
is now 1. Let's call this ref "A" -
Array::Array(const Array &p_from)
is called, which adds to the refcount, which is now at 2. Let's call this ref "B" -
Array::~Array()
is called, which unreferences and takes the refcount down to 1. I've analyzed the memory and found that this is clearing out ref "A". -
Array::Array(const Array &p_from)
is called, which overwrites the_p
pointer to theArrayPrivate
originally held by ref "B"!
Eventually another call to GDExtension's Array::~Array()
is called, but _p
is nullptr. I'm not sure what this is about, but I don't think it's the issue.
It seems clear that ref "B" in the above scenario is never unreferencing and causing the leak...
Alright, I understand what's happening now and I might have a fix(?) Here are the details:
I am going to use the Dictionary
class for this example, as that is the class that was described in this original github issue, but this applies identically to the Array
and TypedArray
classes, and possibly others.
The leak happens when calling the Dictionary
cast operator on GDExtension's Variant
:
Dictionary original_dictionary;
Variant my_variant = original_dictionary;
Dictionary my_dictionary = my_variant; // The cast operator of GDExtenion's Variant causes a leak
When this cast happens, a temporary Dictionary
will be created and then a placement new will be performed on this temporary Dictionary
. Here's what GDExtension's current Variant
cast operator looks like:
Variant::operator Dictionary() const {
Dictionary result;
to_type_constructor[DICTIONARY](result._native_ptr(), _native_ptr());
return result;
}
The to_type_constructor[DICTIONARY](result._native_ptr(), _native_ptr());
line seems to perform a placement new using result
, which has already been initialized with a DictionaryPrivate
. When the placement new happens, it calls the Dictionary
copy constructor, which reasonably assumes that the Dictionary
has not been initialized and overwrites the DictionaryPrivate
pointer with the new one, causing the old one to leak:
Dictionary::Dictionary(const Dictionary &p_from) {
_p = nullptr;
_ref(p_from);
}
If calling to_type_constructor
with those two _native_ptr()
will always perform a placement new, then a solution might be to do something like calling Dictionary::~Dictionary()
, to make sure that the Dictionary
has cleaned up its references and is ready to have its copy constructor called...
Variant::operator Dictionary() const {
Dictionary result;
result.~Dictionary(); // Adding this line fixes the leak
to_type_constructor[DICTIONARY](result._native_ptr(), _native_ptr());
return result;
}
I have tested it and found that it successfully fixes the leak that was demonstrated in the original issue as well as the leak that I was experiencing.
But, this seems sub-optimal to be constructing a Dictionary
just to destruct it immediately afterwards. And maybe even risky to be calling a destructor on an object with the expectation that it will always put everything in a state where it is valid to call the copy constructor again... So I'm currently looking into seeing if there is a safe way to do this by allocating some memory on the stack and then returning it once it has been filled in with the Dictionary
data...
@allenwp Thanks for digging into this!
Can you try my draft PR https://github.com/godotengine/godot-cpp/pull/1378 and see if it fixes the leak for you?
I tested that it passed the automated tests, but I didn't have a chance to actually re-test the leak. If it works, I can try to use it in all the other applicable situations too.
Thanks @dsnopek!
Unfortunately it looks like this method leaks (it leaves an ArrayPrivate
in memory with a refcount of 1). I'm going to look into why this is, probably tomorrow.
Unfortunately it looks like this method leaks (it leaves an ArrayPrivate in memory with a refcount of 1). I'm going to look into why this is, probably tomorrow.
The draft PR only did Dictionary
, given that that was the topic of your last comment, so if you were testing Array
that could certainly be why. :-) I'll update the PR to do Array
too.
Oh, sorry, my mistake on my comment — I adapted your PR code to Array
, since that’s what I’ve been testing with, so that’s why I typed that. So I believe the issue I described does exist for all of these internally reference counted types.
Your proposed approach seems to work similar to some code I was fiddling with using uninitialized memory, but there seems to be an extra constructor call that I don’t quite understand yet.
EDIT: I've moved discussion of a fix to the PR draft #1378.
I've tried the original post's minimal reproduction project with the 4.2.2 godot-cpp release and:
Godot v4.2.2.stable - Windows 10.0.22631 - Vulkan (Forward+) - dedicated NVIDIA GeForce GTX 980 Ti (NVIDIA; 31.0.15.3598) - 13th Gen Intel(R) Core(TM) i7-13700K (24 Threads)
And it is no longer leaking:
I also tried with the old versions:
Godot v4.1.1.stable - Windows 10.0.22631 - Vulkan (Forward+) - dedicated NVIDIA GeForce GTX 980 Ti (NVIDIA; 31.0.15.3598) - 13th Gen Intel(R) Core(TM) i7-13700K (24 Threads)
This issue can now be closed! :)
Thank you for updating!