
[DDP] Add a test case to test a larger model

Open alanwaketan opened this issue 2 years ago • 1 comment

Summary: This commit adds a test case for a larger model, one big enough to trigger multiple all_reduces instead of a single one.
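To illustrate why model size matters here, the following is a minimal sketch of greedy gradient bucketing, under the assumption that gradients are packed into size-capped buckets and each bucket is reduced with one all_reduce. The helper names (`count_buckets`, `linear_param_sizes`), the layer dimensions, and the 25 MiB cap are hypothetical choices for illustration; this is not the test code from this commit, and real DDP bucketing differs in details such as parameter ordering.

```python
# Hypothetical sketch: greedy gradient bucketing, loosely modeled on how DDP
# groups gradients before all_reduce. Not the code added by this commit.


def count_buckets(param_sizes_bytes, bucket_cap_bytes):
    """Return how many buckets (one all_reduce each) a greedy packer creates."""
    buckets = 0
    current = 0
    for size in param_sizes_bytes:
        # Close the current bucket if adding this parameter would overflow it.
        if current > 0 and current + size > bucket_cap_bytes:
            buckets += 1
            current = 0
        current += size
    if current > 0:
        buckets += 1  # count the last, partially filled bucket
    return buckets


def linear_param_sizes(layer_dims, bytes_per_elem=4):
    """Weight and bias sizes in bytes for a stack of fully connected layers."""
    sizes = []
    for fan_in, fan_out in zip(layer_dims, layer_dims[1:]):
        sizes.append(fan_in * fan_out * bytes_per_elem)  # weight matrix
        sizes.append(fan_out * bytes_per_elem)           # bias vector
    return sizes


CAP = 25 * 1024 * 1024  # assumed bucket cap of 25 MiB

small_net = linear_param_sizes([10, 10, 5])               # illustrative small model
large_net = linear_param_sizes([4096, 4096, 4096, 4096])  # illustrative large model

print("small net buckets:", count_buckets(small_net, CAP))
print("large net buckets:", count_buckets(large_net, CAP))
```

Under these assumptions the small model fits in a single bucket, while the large one spills across several, which is the situation a large-net correctness test would exercise.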

Test Plan:

XRT: MASTER_ADDR=localhost MASTER_PORT=6000 python test/test_ddp.py TestXrtDistributedDataParallel.test_ddp_correctness_large_net

PJRT: PJRT_DEVICE=TPU python test/pjrt/test_ddp.py TestPjRtDistributedDataParallel.test_ddp_correctness_large_net

alanwaketan, Oct 10 '22 22:10

Thanks, Will, for approving it. I will fix all the CI issues before merging.

alanwaketan, Oct 11 '22 17:10