[3.13] copyreg._reconstructor failures for protocol 0 and 1 for `dateutil.tz.tzutc`
Bug description:
In [25]: pickle.load(open('taskwarrior/f91b9e98-7586-4317-ae52-b516e97209e5', 'rb'))
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[25], line 1
----> 1 pickle.load(open('taskwarrior/f91b9e98-7586-4317-ae52-b516e97209e5', 'rb'))
File /usr/lib/python3.13/copyreg.py:44, in _reconstructor(cls, base, state)
41 def _reconstructor(cls, base, state):
42 if base is object:
43 #obj = cls.__new__(cls)
---> 44 obj = object.__new__(cls)
45 else:
46 obj = base.__new__(cls, state)
TypeError: object.__new__(tzutc) is not safe, use tzutc.__new__()
It seems that cls.__new__(cls) is the correct way to achieve the desired effect for any type of object?
This issue was not present in 3.12, or at least same code on same pickles used to work.
I was not able to reproduce this with arbitrary data serialization/deserialization (e.g. pickle.dump(tzutc(), f); pickle.load(f) works), which suggests this is dependent on how exactly data is structured, and so I'm including the culprit pickle here: 74b9da0a-ab54-437f-9ab5-549e886d3c83.txt
CPython versions tested on:
3.13
Operating systems tested on:
Linux
Hi, I suspect this is related to a library you're using. The repro fails with this:
ModuleNotFoundError: No module named 'taskw_ng'
I doubt copyreg._reconstructor is the culprit, because that hasn't been touched in over two decades.
Yeah, it is: taskw-ng==0.2.7. The context here is the https://github.com/bergercookie/syncall (data sync between Asana and taskwarrior) after upgrading from 3.12 to 3.13.
I fixed it at the level of copyreg, finding information that this way of using the __new__ method was deprecated long time ago. Which is the reason I'm suggesting to change this upstream regardless. But perhaps it's the serialization that's the issue (as well).
With the above fix the result of a pickle.load() is this:
{'id': 0,
'description': 'Set up a demo environment',
'end': datetime.datetime(2024, 5, 28, 6, 15, 10, tzinfo=tzutc()),
'entry': datetime.datetime(2024, 5, 28, 6, 13, 7, tzinfo=tzutc()),
'modified': datetime.datetime(2024, 5, 28, 6, 15, 11, tzinfo=tzutc()),
'project': 'integral/vxbos',
'status': 'completed',
'uuid': UUID('74b9da0a-ab54-437f-9ab5-549e886d3c83'),
'urgency': 1.25205}
Strangely enough it doesn't contain anything outside of the standard library.
I'm looking at a diff between the culprit file and a dump of the above data, and what immediately stands out is that _reconstructor is mentioned right on top of the file:
diff -u 74b9da0a-ab54-437f-9ab5-549e886d3c83.txt tmp.pickle | head
--- 74b9da0a-ab54-437f-9ab5-549e886d3c83.txt 2025-06-06 18:06:37.516240292 +0800
+++ tmp.pickle 2025-06-06 18:44:29.165722062 +0800
@@ -1,445 +1,85 @@
-ccopy_reg
-_reconstructor
-p0
-(ctaskw_ng.task
-Task
+(dp0
+Vdescription
Update: scratch that, _reconstructor is mentioned later in the second file as well, so that's not the issue, probably the exact chain of calls is.
I don't fully understand what's going on, but it seems that some intermediate function calls are included into the pickle, and that's why it fails with this file, and doesn't with resulting data structure.
I fixed it at the level of copyreg, finding information that this way of using the new method was deprecated long time ago.
It's a Python 2 compatibility shield, as far as I can tell. But I can't see why that would suddenly break things in 3.13.
@ZeroIntensity, do you think pickle should load the attached file without error?
No idea, it's impossible to tell without seeing the code. It's most likely an issue with taskw_ng. I'd report it to them first, and then if they determine it's an issue on our end, they'll file a new report with a pure-stdlib repro.
Problem with that is taskw_ng doesn't cause this error. Furthermore it's not even in the stacktrace:
Exception was raised during program execution.
File "~/.virtualenvs/tw/bin/tw_asana_sync", line 8, in <module>
sys.exit(main())
~~~~^^
File "~/.virtualenvs/tw/lib/python3.13/site-packages/click/core.py", line 1442, in __call__
return self.main(*args, **kwargs)
~~~~~~~~~^^^^^^^^^^^^^^^^^
File "~/.virtualenvs/tw/lib/python3.13/site-packages/click/core.py", line 1363, in main
rv = self.invoke(ctx)
File "~/.virtualenvs/tw/lib/python3.13/site-packages/click/core.py", line 1226, in invoke
return ctx.invoke(self.callback, **ctx.params)
~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "~/.virtualenvs/tw/lib/python3.13/site-packages/click/core.py", line 794, in invoke
return callback(*args, **kwargs)
File "~/.virtualenvs/tw/lib/python3.13/site-packages/syncall/scripts/tw_asana_sync.py", line 233, in main
aggregator.sync()
~~~~~~~~~~~~~~~^^
File "~/.virtualenvs/tw/lib/python3.13/site-packages/syncall/aggregator.py", line 178, in sync
changes_A = self.detect_changes(self._helper_A, items_A)
File "~/.virtualenvs/tw/lib/python3.13/site-packages/syncall/aggregator.py", line 159, in detect_changes
cached_item = pickle_load(serdes_dir / item_id)
File "~/.virtualenvs/tw/lib/python3.13/site-packages/bubop/serial.py", line 20, in pickle_load
return pickle.load(f)
~~~~~~~~~~~^^^
File "/usr/lib/python3.13/copyreg.py", line 43, in _reconstructor
obj = object.__new__(cls)
object.__new__(tzutc) is not safe, use tzutc.__new__()
And I'm not sure at what point and with what version of that library was this pickle file created, but I am sure that it wasn't causing an error until the upgrade. I could report it to syncall or bubop, but looking at their code I don't see any misuse either. What such misuse could possibly even look like?
At the same time calling object.__new__(tzutc) does seem to cause an issue in both 3.13 and 2.7. Admittedly I don't understand why that should throw an error while tzutc.__new__(tzutc) shouldn't.
Can you send the code that is creating the file?
What is tzutc?
It looks like in version that created the pickle data, tzutc was implemented in Python, so its __new__ was inherited from object, and object.__new__(tzutc) created a "raw" instance of tzutc, with empty __dict__. In version that loads the pickle data, tzutc is an extension type, so object.__new__(tzutc) no longer works, because it doesn't know how to initialize internal state of tzutc.
Look what versions of the library that provides tzutc was at pickle and unpickle time.
This is an interesting problem. It cannot be solved at the user side (implementing __reduce__ or __reduce_ex__ for tzutc can help with future pickles, but not with unpickling old pickle data). Fixing it at Python side requires additional investigation if we don't want to break other user code.
This all happens only with historical pickle protocols 0 and 1. The best chance to recover your data is to use an older version of the library to unpickle it and then pickle it using a newer pickle protocol.
Can you send the code that is creating the file?
I think it's the code where pickle_dump is called from here: https://github.com/bergercookie/syncall/blob/master/syncall/aggregator.py
What is tzutc?
It's dateutil.tz.tzutc, which inherits datetime.tzinfo.
Thanks for breaking it down, @serhiy-storchaka, this roughly matches my timeline with these files and Python upgrades. Let me know if I can do more to investigate it!