symengine.py icon indicating copy to clipboard operation
symengine.py copied to clipboard

Subclasses of Symbol leak

Open isuruf opened this issue 4 years ago • 10 comments

The C++ object has to keep a reference to the Python object and the Python object needs to keep a reference to the C++ object.

See https://stackoverflow.com/questions/44993726/cyclic-reference-in-python-c-extension

isuruf avatar Dec 01 '21 07:12 isuruf

cc @bocklund

To avoid lots of memory leaks, create subclass objects sparingly.

isuruf avatar Dec 01 '21 17:12 isuruf

If you create a single instance of a subclass object in Python, but use that object in many complex expressions, does that increase the memory leak issue, or is that still only a single object for the purposes of this discussion?

richardotis avatar Dec 01 '21 19:12 richardotis

Only a single object.

isuruf avatar Dec 01 '21 19:12 isuruf

@isuruf I'm looking into this a little more. It seems like the traditional solution is to use a weak reference to break the cycle.

If my understanding is correct, it looks like SymEngine's RCP doesn't implement any weak reference features. Would it make sense to use the CPython API for PyWeakref_ from the PySymbol side?

bocklund avatar Apr 21 '22 23:04 bocklund

Weak references can't fix this as they have to be non-weak references to avoid garbage collecting objects that are still useful.

isuruf avatar May 04 '22 01:05 isuruf

From the comments of the StackOverflow answer, it seems weak references would be viable. Can you help correct my misconception about which "still useful" objects could be garbage collected?

My understanding is that the Python object holds a reference to the C++ object (which has a weak reference back to the Python object), so in a vacuum, the reference count of the Python object is zero and the C++ object is one. That is:

  1. If someone else is holding a reference to the Python object, both objects have non-zero reference counts.
  2. If no one else holds a reference to the Python object, it can be garbage collected 2a. if no one else holds a reference to the C++ object, its reference count drops to zero and it can be destructed 2b. if some other C++ code hold a reference to the C++ object, it will be kept alive while it's Python counterpart is garbage collected.

(1) and (2a) seem like expected behavior, while (2b) could be an issue. Can (2b) actually occur in practice? Is the C++ object "still useful" if no one is holding a reference to the Python counterpart?

bocklund avatar May 04 '22 18:05 bocklund

Yes, 2b is still useful.

For example, the following would segfault if it was a weak reference.

x = MySymbolClass("x", a=1)
y = x + 1
del x
print(y - 1)

isuruf avatar May 04 '22 21:05 isuruf

We can implement whatever is needed in the RCP classes. As long as it does not slow down the pure C++ SymEngine, then there shouldn't be a problem.

Cannot the Python wrapper simply increase the "refcount" of the C++ object, and then when the Python class goes out of scope, decrease the "refcount"? That way I would think everything would work.

certik avatar May 04 '22 22:05 certik

Cannot the Python wrapper simply increase the "refcount" of the C++ object, and then when the Python class goes out of scope, decrease the "refcount"?

That's how it works right now, but the cycle occurs because the C++ object needs to increase the refcount of the Python class too.

isuruf avatar May 08 '22 00:05 isuruf

See https://github.com/symengine/symengine.py/pull/403 for a hacky workaround.

isuruf avatar May 08 '22 00:05 isuruf