cpython icon indicating copy to clipboard operation
cpython copied to clipboard

3.14 regression: slot dataclasses classes leak original class

Open Tinche opened this issue 6 months ago • 3 comments

Bug report

Bug description:

While trying to test cattrs on 3.14 I ran into this issue. Here's a simple reproducer that passes on 3.13, but doesn't on 3.14.

import gc
from dataclasses import dataclass


@dataclass(slots=True)
class B:
    b: int


gc.collect()

assert [
    o for o in gc.get_objects() if hasattr(o, "__name__") and o.__name__ == "B"
] == [B]

Originally I ran into this issue with slotted attrs classes but the core problem is the same. Since slotness cannot be added to a class, dataclasses and attrs classes create a class copy with slots instead. The old class used to hang around until a GC collection happened (guess there are some reference cycles, but I never investigated thoroughly).

On 3.14, a full GC collection does not clean up the original class, causing a leak. Apart from leaking, in case of subclassing the original class will remain in the list of the parent .__subclasses__(), causing an issue.

CPython versions tested on:

3.14

Operating systems tested on:

macOS

Linked PRs

  • gh-135230

Tinche avatar Jun 06 '25 23:06 Tinche

I think this is because the annotate function for the class holds on to the original class namespace, and that namespace dict contains the descriptors B.__dict__ and B.__weakref__ that refer back to the original class. Not too sure how to fix that yet.

JelleZijlstra avatar Jun 07 '25 01:06 JelleZijlstra

This seems hard to avoid, maybe we should just live with the change? The reason is that the annotation scope needs to hold a reference to the actual class dict (for reasons explained in https://jellezijlstra.github.io/pep695#class-scopes) and the class dict contains two descriptors, for __dict__ and __weakref__, that contain references back to the class object.

In Python 3.12 and 3.13 this behavior is already observable if you use a type alias or TypeVar bound in the dataclass, with one of these class definitions:

@dataclass(slots=True)
class B:
    def f[T: int](self, x: T) -> T: ...
    b: int

@dataclass(slots=True)
class B:
    type T = int
    b: int

Both of these fail the assertion in the post above.

I put up a change in #135230 that fixes this issue, but messing with the descriptors feels like it's probably going to break something else.

JelleZijlstra avatar Jun 07 '25 03:06 JelleZijlstra

@JelleZijlstra I think you're right that the leak itself is not a big deal, and we can just live with it.

What is a bigger deal is this:

from dataclasses import dataclass


class A:
    pass


@dataclass(slots=True)
class B(A):
    pass


print(A.__subclasses__())  # [<class '__main__.B'>, <class '__main__.B'>]

This will break a piece of cattrs and anyone using __subclasses__() with slotted dataclasses and attrs classes. Maybe we can work around it in the __subclassess__() implementation somehow?

Tinche avatar Jun 28 '25 15:06 Tinche

release blocker status just to help force consideration. It may wind up deferred.

gpshead avatar Jul 20 '25 20:07 gpshead

@JelleZijlstra I think you're right that the leak itself is not a big deal, and we can just live with it.

What is a bigger deal is this:

from dataclasses import dataclass


class A:
    pass


@dataclass(slots=True)
class B(A):
    pass


print(A.__subclasses__())  # [<class '__main__.B'>, <class '__main__.B'>]

This will break a piece of cattrs and anyone using __subclasses__() with slotted dataclasses and attrs classes. Maybe we can work around it in the __subclassess__() implementation somehow?

This actually is not true if you run gc.collect() first.

>>> from dataclasses import dataclass
>>> import gc
>>> class A: pass
... 
>>> @dataclass(slots=True)
... class B(A):
...     attr: int
...     
>>> print(A.__subclasses__())
[<class '__main__.B'>, <class '__main__.B'>]
>>> gc.collect()
222
>>> print(A.__subclasses__())
[<class '__main__.B'>]

This confused me greatly when I was trying to write test cases just now, but what's going on seems to be that the __dict__ and __weakref__ slots are added only for direct subclasses of object.

In any case, I put up another possible fix in #136893. It has some potential compatibility implications, but it's something you could also do in e.g. attrs if you choose.

JelleZijlstra avatar Jul 21 '25 00:07 JelleZijlstra

So I do actually run gc.collect() first, but this trick seems to have stopped working on 3.14. I'm not on my computer at the moment, but this is what causes the cattrs test suite to fail on 3.14 and what initially motivated this issue.

Tinche avatar Jul 21 '25 09:07 Tinche

To be clear, gc.collect() doesn't help if the slotted dataclass is a direct child of object (which is probably the more common case in practice). It helps if the slotted class is a child of another Python class.

In any case, just to reiterate: I put up a proposed fix for dataclasses that should also be applicable to attrs. It's rather hacky; let me know if you think it's an acceptable solution.

JelleZijlstra avatar Jul 21 '25 13:07 JelleZijlstra

release blocker status just to help force consideration. It may wind up deferred.

https://github.com/python/cpython/pull/136960 has been merged with a fix. Let's move this to deferred to see how it goes in 3.14 RC1.

hugovk avatar Jul 22 '25 08:07 hugovk

Here's another possible direction to solve this: #136966

encukou avatar Jul 22 '25 08:07 encukou

We decided to revert the #136893 fix because modifying the class dict directly is playing with fire.

I think an approach like Petr's in #136966 is promising but it feels a bit late to do that for 3.14, because we're making changes deep in core descriptor logic. We should try that for 3.15 though.

For cases where you use introspection, there's another workaround possible, which relies on the fact that @dataclass in fact dataclassifies the non-slotted class before it creates the slots wrapper: you can filter out any classes that (a) don't have __slots__ and (b) have cls.__dataclass_params__.slots = True. This should fix use cases like https://catt.rs/en/stable/strategies.html#include-subclasses-strategy .

We could potentially make this a bit easier by setting an attribute like .__dataclass_replaced_with_wrapper_class__ on the original class.

JelleZijlstra avatar Jul 22 '25 16:07 JelleZijlstra

We have the same problem without dataclasses.

>>> class A: pass
... 
>>> class B(A): pass
... 
>>> A.__subclasses__()
[<class '__main__.B'>]
>>> del B
>>> A.__subclasses__()
[<class '__main__.B'>]
>>> import gc; gc.collect(); gc.collect()
0
>>> A.__subclasses__()
[<class '__main__.B'>]

serhiy-storchaka avatar Jul 22 '25 16:07 serhiy-storchaka

We have the same problem without dataclasses.

@serhiy-storchaka It doesn't repro if I run this script as a file (python.bat .\test.py). Maybe it related to REPL?

 [2025-07-22 21:30:28] [D:\Sources\_pythonish\cpython\main]  [main ≡ +29 ~0 -0 !] [󰅒 3:32.666]
➜ .\python.bat -i .\test.py
Running Debug|x64 interpreter...
[<class '__main__.B'>]
[<class '__main__.B'>]
[]
>>> exit()

 [2025-07-22 21:30:36] [D:\Sources\_pythonish\cpython\main]  [main ≡ +29 ~0 -0 !] [󰅒 2.23]
➜ .\python.bat .\test.py   
Running Debug|x64 interpreter...
[<class '__main__.B'>]
[<class '__main__.B'>]
[]

sergey-miryanov avatar Jul 22 '25 16:07 sergey-miryanov

@JelleZijlstra explained to me that this is because the REPL holds a reference to the result of A.__subclasses__().

serhiy-storchaka avatar Jul 22 '25 16:07 serhiy-storchaka

Yeah. After this note it looks logical. To be sure that I understand this correct: A.__subclasses__() returns list of types with strong refs. And REPL holds a strong ref to this list. So list holds an extra ref to B even if we delete it from REPL global. If we delete B and call gc.collect() before first call of __subclasses__ then it returns [].

sergey-miryanov avatar Jul 22 '25 17:07 sergey-miryanov

Yes. If you're playing with this in the REPL you can prevent the problem by using print(A.__subclasses__()) instead of A.__subclasses__().

Note that __subclasses__ may always include classes that are currently unreachable, because classes are always part of a reference cycle (even before 3.14) and therefore can only get collected when GC runs, which happens at an indeterminate time.

JelleZijlstra avatar Jul 22 '25 17:07 JelleZijlstra

may always include classes that are currently unreachable, because classes are always part of a reference cycle (even before 3.14)

I'm not sure I get you correctly. IIUC call of __subclasses__ return list that holds strong refs (from B to A via base, but not from A to B) and there are no reference cycles. tp_subclasses contains weakrefs so no reference cycles too. Can you correct me where I'm wrong?

sergey-miryanov avatar Jul 22 '25 17:07 sergey-miryanov

Yes, the subclass list is stored as a list of weakrefs (which are materialized into strong refs if you actually call __subclasses__). However, every Python class is necessarily part of a reference cycle unrelated to the subclass list; I explained why in https://github.com/python-attrs/attrs/issues/1047#issuecomment-3104006808

JelleZijlstra avatar Jul 22 '25 17:07 JelleZijlstra

Ough! Thanks, I completely forgot about MRO!

sergey-miryanov avatar Jul 22 '25 17:07 sergey-miryanov

Another attempt: #137047. This is similar to the reverted version, but no longer relies on hacks to edit the __dict__ directly. Instead, we expose a private function in sys that clears out the problematic descriptors in C code.

JelleZijlstra avatar Jul 23 '25 15:07 JelleZijlstra

If the problem is the annotate function that holds reference to the original class namespace, can we create a new annotate function that holds reference to the new class namespace? If this impossible in Python, would not a private function in sys that creates the annotate function be more safe?

serhiy-storchaka avatar Jul 23 '25 15:07 serhiy-storchaka

There are multiple annotate functions potentially involved (the one for the class annotations, those for any methods with annotations, and potentially more for TypeVar bounds in generic methods). We'd have to walk through the class dictionary to find all of them, which might not even be possible if the methods are wrapped by decorators. This will be difficult to get right and more complicated than the approach in #137047.

JelleZijlstra avatar Jul 23 '25 16:07 JelleZijlstra

Just out of curiosity, since we're now delving into C-land, would it be possible to have a C function that would rework a class in-place to make it a slotted class? I feel like it would solve a number of issues at the root.

Tinche avatar Jul 23 '25 16:07 Tinche

@Tinche yes that is probably a better long-term solution. My understanding is that it's possible in C to change "slottedness" after the fact, but it's safe only if no instances of the class have been created yet, and proving that condition is not trivial. Maybe there's a way but it's not something we can do for 3.14 at this point.

JelleZijlstra avatar Jul 23 '25 16:07 JelleZijlstra

It looks like if the annotation is a ForwardRef, the original example still fails.

With int it now runs, but this still raises the AssertionError:

import gc
from dataclasses import dataclass

@dataclass(slots=True)
class B:
    b: undefined

gc.collect()

assert [
    o for o in gc.get_objects() if hasattr(o, "__name__") and o.__name__ == "B"
] == [B]

This appears to be because ForwardRefs in dataclass fields still refer to the original class.

Intentionally keeping the old class for demonstration:

from dataclasses import dataclass, fields

class OLD_B:
    b: undefined

NEW_B = dataclass(OLD_B, slots=True)

for f in fields(NEW_B):
    owner = f.type.__owner__
    print(f"{owner is NEW_B = }")  # False
    print(f"{owner is OLD_B = }")  # True

DavidCEllis avatar Aug 12 '25 19:08 DavidCEllis

That one is likely fixable in dataclasses by only checking the annotations after we reslotify the class.

JelleZijlstra avatar Aug 13 '25 13:08 JelleZijlstra

Yes, I have a potential solution for that but I'm trying to separate it from the solution for #137530 as the current __init__ method and a generated __annotate__ function can both maintain references to the original class.

DavidCEllis avatar Aug 13 '25 14:08 DavidCEllis

In 3.14 we have #137047, a workaround; in 3.15 we have #136966, a workaround that's also an improvement (though the improvement is not directly related to the issue).

Shall we leave this open for something like allowing decorators to fiddle with __build_class__ arguments?

encukou avatar Aug 18 '25 12:08 encukou

Shall we leave this open for something like allowing decorators to fiddle with __build_class__ arguments?

I think that's a good idea but it strays a bit too far from the original issue here, so probably better in a new issue (plus, that change is going to need a PEP).

However, let's keep this open for other cases where annotations keep the unslotted class alive, like the case @DavidCEllis came up with involving ForwardRefs.

JelleZijlstra avatar Aug 18 '25 17:08 JelleZijlstra