Subclasses of Symbol leak
The C++ object has to keep a reference to the Python object and the Python object needs to keep a reference to the C++ object.
See https://stackoverflow.com/questions/44993726/cyclic-reference-in-python-c-extension
cc @bocklund
To avoid lots of memory leaks, create subclass objects sparingly.
If you create a single instance of a subclass object in Python, but use that object in many complex expressions, does that increase the memory leak issue, or is that still only a single object for the purposes of this discussion?
Only a single object.
@isuruf I'm looking into this a little more. It seems like the traditional solution is to use a weak reference to break the cycle.
If my understanding is correct, it looks like SymEngine's RCP doesn't implement any weak reference features. Would it make sense to use the CPython API for PyWeakref_ from the PySymbol side?
Weak references can't fix this as they have to be non-weak references to avoid garbage collecting objects that are still useful.
From the comments of the StackOverflow answer, it seems weak references would be viable. Can you help correct my misconception about which "still useful" objects could be garbage collected?
My understanding is that the Python object holds a reference to the C++ object (which has a weak reference back to the Python object), so in a vacuum, the reference count of the Python object is zero and the C++ object is one. That is:
- If someone else is holding a reference to the Python object, both objects have non-zero reference counts.
- If no one else holds a reference to the Python object, it can be garbage collected 2a. if no one else holds a reference to the C++ object, its reference count drops to zero and it can be destructed 2b. if some other C++ code hold a reference to the C++ object, it will be kept alive while it's Python counterpart is garbage collected.
(1) and (2a) seem like expected behavior, while (2b) could be an issue. Can (2b) actually occur in practice? Is the C++ object "still useful" if no one is holding a reference to the Python counterpart?
Yes, 2b is still useful.
For example, the following would segfault if it was a weak reference.
x = MySymbolClass("x", a=1)
y = x + 1
del x
print(y - 1)
We can implement whatever is needed in the RCP classes. As long as it does not slow down the pure C++ SymEngine, then there shouldn't be a problem.
Cannot the Python wrapper simply increase the "refcount" of the C++ object, and then when the Python class goes out of scope, decrease the "refcount"? That way I would think everything would work.
Cannot the Python wrapper simply increase the "refcount" of the C++ object, and then when the Python class goes out of scope, decrease the "refcount"?
That's how it works right now, but the cycle occurs because the C++ object needs to increase the refcount of the Python class too.
See https://github.com/symengine/symengine.py/pull/403 for a hacky workaround.