PiPPy
Hard to debug issue when passing a DTensor to spmd.distribute_tensor() (cuda/nccl only)
Passing a DTensor into spmd.distribute_tensor, or more specifically into DeviceMesh, causes the following issues:
- in device_mesh.broadcast, it causes an assert to fail deep inside torch code
- in device_mesh.scatter, it causes an invalid free in the CachingAllocator.
This is most likely related to tensor sub-classing corner cases.
For now, we should at least check that no DTensor is passed into DeviceMesh.
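A minimal sketch of what such a check could look like. The `Tensor` and `DTensor` classes below are stand-ins so the snippet runs without torch, and `check_not_dtensor` is a hypothetical helper name, not an existing spmd or DeviceMesh API. The idea is to fail early with a readable error instead of hitting the assert or the invalid free later on.

```python
class Tensor:
    """Stand-in for torch.Tensor (assumption, for illustration only)."""


class DTensor(Tensor):
    """Stand-in for the spmd DTensor subclass (assumption)."""


def check_not_dtensor(tensor, op_name):
    """Hypothetical guard: reject DTensor inputs before a collective runs."""
    if isinstance(tensor, DTensor):
        raise TypeError(
            f"{op_name} received a {type(tensor).__name__}; "
            "DTensor inputs are not supported here"
        )
    return tensor
```

Under these assumptions, the guard would be called at the top of the broadcast and scatter code paths, e.g. `check_not_dtensor(tensor, "scatter")`, so that a plain tensor passes through unchanged and a DTensor raises a `TypeError` right away.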