xla Cannot move tensors to cpu when in a xmp spawn process

Open radna0 opened this issue 4 months ago • 3 comments

🐛 Bug

all_frames = torch.cat(all_frames, dim=0).cpu().numpy()
RuntimeError: Bad StatusOr access: INTERNAL: during context [pre-optimization]: RET_CHECK failure (third_party/tensorflow/compiler/xla/service/hlo_verifier.cc:402) replica_count == 1 || n == replica_count In kCrossReplica mode, replica groups should contain 8 replicas, but found 2: %all-gather.20571 = f16[81920,8,64]{2,1,0} all-gather(f16[40960,8,64]{2,1,0} %add.20570), replica_groups={{0,1}}, dimensions={0}
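
To make the failing check easier to read, here is a small pure-Python sketch of the condition quoted in the RET_CHECK message (`replica_count == 1 || n == replica_count`). The function name `replica_groups_ok` is hypothetical, and reading `n` as the total number of replica ids listed across all groups is an assumption based on the wording of the message, not taken from the XLA source.

```python
def replica_groups_ok(replica_groups, replica_count):
    """Mirror the condition quoted in the error message (illustrative only)."""
    # Assumption: n is the number of replica ids listed across all groups.
    n = sum(len(group) for group in replica_groups)
    return replica_count == 1 or n == replica_count


# The case from the traceback: 8 replicas, but replica_groups={{0,1}}.
print(replica_groups_ok([[0, 1]], replica_count=8))  # False

# A group that lists all 8 replicas satisfies the condition.
print(replica_groups_ok([list(range(8))], replica_count=8))  # True
```

Under this reading, the all-gather in the trace combines two shards of 40960 rows into 81920 rows over the group {0,1}, while the compilation is configured for 8 replicas, so the verifier rejects the graph before `.cpu()` can fetch the result.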

To Reproduce

Steps to reproduce the behavior:

1. Spawn a process with xmp.spawn.
2. Move tensors to the CPU using .cpu().

Expected behavior

The tensors should be moved to the CPU without error.

Environment

Reproducible on XLA backend [CPU/TPU/CUDA]: TPU v2-8 and v3-8
torch_xla version: nightly 2.6

Additional context

Oct 17 '24 09:10 radna0
