Broken tests for pickling circular reference DataPipe with dill

Open ejguan opened this issue 2 years ago • 3 comments

🐛 Describe the bug

While working on a fix for the unhashable DataPipe problem in https://github.com/pytorch/pytorch/pull/80509, I found that this test is broken: https://github.com/pytorch/pytorch/blob/e266bea79395399d60bd3c684545f69ae6900236/test/test_datapipe.py#L2307-L2349

The failure is: TypeError: cannot pickle 'PyCapsule' object

I tried removing all the DataPipe references from the CustomDataPipe and LambdaIterDataPipe, but the error still persisted. I am not sure what the root cause is.

I will disable the test in my PR, but we still need to find the culprit.

Versions

master branch on PT

ejguan avatar Jun 30 '22 22:06 ejguan

cc: @NivekT

ejguan avatar Jun 30 '22 22:06 ejguan

It looks like dill is trying to pickle the entire torch module (or a smaller part of it). One option is to make the functions in the test picklable, or to move them to a helper file.
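To illustrate the "make the functions picklable" option, here is a minimal sketch using only the standard library. It is not taken from the actual test: `double_lambda` and `double_partial` are hypothetical stand-ins for whatever function the test passes to the DataPipe. The point is that a callable which pickles by reference does not need to be serialized by value, so nothing from its surrounding module has to be captured.

```python
import functools
import operator
import pickle

# Hypothetical stand-in for a lambda passed to a DataPipe in the test.
double_lambda = lambda x: x * 2

# The standard pickle module serializes functions by reference
# (module + qualified name), so a lambda cannot be pickled by it.
try:
    pickle.dumps(double_lambda)
    lambda_pickled = True
except (pickle.PicklingError, AttributeError):
    lambda_pickled = False

# A partial over an importable function pickles by reference, so no
# by-value serialization of the enclosing module is required.
double_partial = functools.partial(operator.mul, 2)
restored = pickle.loads(pickle.dumps(double_partial))
result = restored(21)

print(lambda_pickled, result)
```

Moving the test functions to a helper file would have a similar effect, assuming the helper module is importable wherever the object is unpickled.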

VitalyFedyunin avatar Jul 06 '22 19:07 VitalyFedyunin

@ejguan and I did some digging into this:

dill 0.3.4 works with pytest, but not python (it raises TypeError: cannot pickle '_abc._abc_data' object).

dill 0.3.5 doesn't work with pytest or python (it raises TypeError: cannot pickle 'PyCapsule' object). It still fails when we swap LambdaIterDataPipe for CustomIterDataPipe, so the issue isn't necessarily tied to lambda usage.

The issue seems to be related to how dill handles circular references?
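As a sanity check on that hypothesis, here is a small sketch showing that the standard pickle module handles a reference cycle between built-in containers on its own. It uses plain dicts as hypothetical stand-ins for DataPipes and does not reproduce the dill failure; it only suggests that a cycle by itself is not enough to break pickling, so the interaction with dill is the part worth investigating.

```python
import pickle

# Two plain dicts that reference each other, as a stand-in for a
# DataPipe graph with a circular reference (hypothetical structure).
node = {"name": "source", "children": []}
child = {"name": "child", "parent": node}
node["children"].append(child)

# pickle tracks already-serialized objects in a memo table, so the
# cycle is written once and restored as a cycle.
clone = pickle.loads(pickle.dumps(node))
cycle_preserved = clone["children"][0]["parent"] is clone

print(cycle_preserved)
```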

NivekT avatar Jul 15 '22 18:07 NivekT