
Align `__post_init__` behaviour after `copy.copy` / `copy.deepcopy` / `copy.replace` with dataclasses

Open provinzkraut opened this issue 1 month ago • 5 comments

Fix #874.

Align Struct's __post_init__ behaviour with that of dataclasses for instances created by copy.copy / copy.deepcopy / copy.replace.

  • Call __post_init__ on copy.replace / __replace__
  • Do not call __post_init__ on copy.copy / __copy__
  • Do not call __post_init__ on copy.deepcopy / __deepcopy__
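
For reference, the dataclass behaviour these three points align with can be checked using only the standard library. This is a small sketch rather than part of the PR; the `Point` class and the `calls` list are made up for illustration, and `dataclasses.replace` is used instead of `copy.replace` so it also runs on Python versions before 3.13.

```python
import copy
from dataclasses import dataclass, replace

# Records the value of x each time __post_init__ runs (illustrative helper).
calls = []


@dataclass
class Point:
    x: int

    def __post_init__(self):
        calls.append(self.x)


p = Point(1)                # the constructor runs __post_init__
replaced = replace(p, x=2)  # replace() goes through __init__, so it runs again
shallow = copy.copy(p)      # no __init__, so no __post_init__
deep = copy.deepcopy(p)     # no __init__, so no __post_init__

print(calls)  # [1, 2]
```

Only construction and `replace` add an entry to `calls`; both copies are equal to `p` without triggering the hook.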

This is a breaking change in behaviour, as structs intentionally did not call __post_init__ after a __replace__ operation. However, as this diverges from dataclass behaviour, it's probably the right thing to do.

To achieve the desired copy.deepcopy behaviour, I had to implement a new __deepcopy__ method (previously structs did not define a custom __deepcopy__).
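
As a rough pure-Python illustration of the idea (not the actual implementation, which lives in msgspec's C extension), a custom `__deepcopy__` can allocate the new instance without going through `__init__`, which is what keeps `__post_init__` from running. The `Record` class and its call counter below are hypothetical.

```python
import copy


class Record:
    """Hypothetical stand-in for a struct-like class."""

    post_init_calls = 0  # class-level counter, for illustration only

    def __init__(self, items):
        self.items = items
        self.__post_init__()

    def __post_init__(self):
        Record.post_init_calls += 1

    def __deepcopy__(self, memo):
        cls = type(self)
        new = cls.__new__(cls)  # allocate without running __init__
        memo[id(self)] = new    # register early so reference cycles resolve
        new.items = copy.deepcopy(self.items, memo)
        return new


r = Record([[1, 2], [3]])
clone = copy.deepcopy(r)

print(Record.post_init_calls)            # 1
print(clone.items == r.items)            # True
print(clone.items[0] is r.items[0])      # False
```

The counter stays at 1 after the deep copy, while the nested lists are still copied recursively through the shared `memo` dict.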

~~An open question though is how to behave in the case of a __copy__ operation. Currently, __post_init__ isn't called there either. Intuitively, it would make sense for __copy__ and __replace__ to behave the same in regards to __post_init__, as they're similar operations (both create a new instance from an existing instance of the same type).~~

provinzkraut • Nov 15 '25 15:11

There is no need to call __post_init__ on copy as the "validation" of the attributes already happened when creating the object. dataclasses don't call __post_init__ on copy.copy() either.
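
A minimal sketch of that reasoning, using a plain dataclass with a validating `__post_init__` (the `Port` class is invented for this example): a copy carries over values that already passed validation, whereas `replace` can introduce new values, so re-running the check there is what catches bad input.

```python
import copy
from dataclasses import dataclass, replace


@dataclass
class Port:
    number: int

    def __post_init__(self):
        # Validation that ran once when the object was first created.
        if not 0 < self.number < 65536:
            raise ValueError(f"invalid port: {self.number}")


p = Port(8080)
same = copy.copy(p)  # same field values, nothing new to validate

try:
    replace(p, number=-1)  # new field value, so validation runs again
    rejected = False
except ValueError:
    rejected = True

print(rejected)  # True
```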

While testing the behaviour of msgspec.Structs vs dataclasses, I just noticed something odd, namely that msgspec.Struct.__post_init__ gets called on copy.deepcopy() but not on copy.copy()! dataclasses call __post_init__ on neither copy.copy() nor copy.deepcopy().

Testscript.py:

import copy
from dataclasses import dataclass, replace

import msgspec

@dataclass
class D:
    x: int
    def __post_init__(self):
        print("  - dataclass: post init called")


class M(msgspec.Struct):
    x: int
    def __post_init__(self):
        print("  - msgspec: post init called")


print("Construct objects")
d = D(1)
m = M(1)

print("Test replace")
d2 = replace(d, x=2)
m2 = msgspec.structs.replace(m, x=2)

print("Test copy.copy()")
d3 = copy.copy(d)
m3 = copy.copy(m)

print("Test copy.deepcopy()")
d4 = copy.deepcopy(d)
m4 = copy.deepcopy(m)

Output:

Construct objects
  - dataclass: post init called
  - msgspec: post init called
Test replace
  - dataclass: post init called
Test copy.copy()
Test copy.deepcopy()
  - msgspec: post init called

kramar11 • Nov 17 '25 13:11

Is there a compelling reason to call that upon a deep copy? If not, then I'd prefer to also fix that in this PR.

ofek • Nov 23 '25 18:11

> Is there a compelling reason to call that upon a deep copy? If not, then I'd prefer to also fix that in this PR.

None that I can think of. Mirroring dataclass behaviour seems to be sensible, however, removing this __post_init__ call on copy.deepcopy would be a breaking change imo, so I'm not sure how we want to go about that.

provinzkraut • Nov 23 '25 18:11

I think breaking changes are fine since we are still sub-1.0 and we're also going to introduce others like https://github.com/jcrist/msgspec/pull/790. Both will come in the next minor release.

ofek • Nov 23 '25 19:11

> I think breaking changes are fine since we are still sub-1.0 and we're also going to introduce others like #790. Both will come in the next minor release.

I'll update this PR accordingly, then.

provinzkraut • Nov 23 '25 20:11