pyosmium icon indicating copy to clipboard operation
pyosmium copied to clipboard

Cyclic references make pyosmium crash with "Node callback keeps reference to OSM object. This is not allowed."

Open stalker314314 opened this issue 4 years ago • 3 comments

As explained in task: https://phabricator.wikimedia.org/T261162 Seems that, if there is cyclic reference in frame, pyosmium would complain and crash program. It is not proper solution to do gc.collect from client program. Is there a way to ignore this check somehow?

stalker314314 avatar Aug 24 '20 20:08 stalker314314

So the root problem here is that pywikibot keeps an exception frame, which in turn keeps a reference of the function frame including the n parameter and the pywikibot object (which creates the cycle). Pyosmium only cares about the fact that this n parameter is kept. It doesn't care about anything else. Minimal example to reproduce:

class BadHandler(o.SimpleHandler):

    def node(self, n):
        try:
            raise RuntimeError("Whatever")
        except RuntimeError as e:
            remember_me = e

I'm not convinced that this is how exceptions are supposed to be used. We probably could do an additional GC run when an illegal reference was found but that will make your code very slow without any warning. I'd just del entry at the end of the function to make sure the evil exception holder is gone. This breaks the cycle, so there is no further GC run necessary.

lonvia avatar Aug 24 '20 22:08 lonvia

So the root problem here is that pywikibot keeps an exception frame, which in turn keeps a reference of the function frame including the n parameter and the pywikibot object (which creates the cycle)

The root problem is that an object (the page) keeps a reference to the an exception, that is created with the object as an argument. That is the cycle. The exception in this cycle then keeps a reference to the traceback which would then reference the frame.

del entry doesn't work because the the cycle containing the frame is broken, but frame is still referred to by the cycle of exception and the exception holder. The simplest workaround would be del n so that the frame, even if referred to by the cycle, would not reference the OSM object.

Though, yes, I do agree that exception holding is unconventional (and perhaps non-Pythonic?), and would cause surprises in memory usage and reference counting, however I can't think of a better way to show that 'oh hey this raised this exception before'. Ideally we would only save the inner frames and reconstruct the stack trace with new outer frames and saved inner frames, however, frames are not mutable and cannot be instantiated....

zhuyifei1999 avatar Aug 25 '20 08:08 zhuyifei1999

I ran into the same problem in Google Colab, after interrupting the execution of a notebook cell (which raises a KeyboardInterrupt exception).

tibbe avatar Sep 07 '21 12:09 tibbe

The new code without the reference check is now available as a pre-release. Install with pip install --pre osmium. Please test this and report if there are any regressions.

lonvia avatar Dec 14 '22 09:12 lonvia