torchtitan

Benchmark SymmMem's all_to_all_vdev_2d on NVL72

Open syed-ahmed opened this issue 2 months ago • 2 comments

  • We need to check the functionality of all_to_all_vdev_2d on NVL72 and document anything missing.
  • Benchmark whether the kernel reaches the expected NVLink bandwidth.
  • Integrate into torchtitan for EP communication.
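
For the bandwidth item, a minimal sketch of how a measured time could be turned into an achieved-bandwidth figure and compared against a target. This is not part of SymmMem or torchtitan: `all_to_all_bandwidth`, `BandwidthReport`, and every parameter name are hypothetical, the kernel time is assumed to be measured elsewhere (for example with CUDA events), and the expected peak is an input you supply rather than a number taken from this issue.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class BandwidthReport:
    bytes_sent: int            # bytes this rank sent to *other* ranks
    achieved_gb_per_s: float   # bytes_sent / elapsed, with 1 GB = 1e9 bytes
    fraction_of_peak: float    # achieved / expected peak, unitless


def all_to_all_bandwidth(
    send_splits: Sequence[int],  # rows this rank sends to each peer (index = peer rank)
    rank: int,                   # this rank's index into send_splits
    bytes_per_row: int,          # hidden_dim * dtype size, for example
    elapsed_s: float,            # measured kernel time in seconds (assumed given)
    peak_gb_per_s: float,        # expected per-GPU unidirectional peak (assumed given)
) -> BandwidthReport:
    """Hypothetical helper: unidirectional send bandwidth of one all-to-all call."""
    if elapsed_s <= 0 or peak_gb_per_s <= 0:
        raise ValueError("elapsed_s and peak_gb_per_s must be positive")
    # Rows kept on the local rank do not cross the interconnect, so skip them.
    remote_rows = sum(n for peer, n in enumerate(send_splits) if peer != rank)
    bytes_sent = remote_rows * bytes_per_row
    achieved = bytes_sent / elapsed_s / 1e9
    return BandwidthReport(bytes_sent, achieved, achieved / peak_gb_per_s)


if __name__ == "__main__":
    # Made-up numbers purely to show the call shape.
    report = all_to_all_bandwidth(
        send_splits=[100, 200, 300, 400],
        rank=0,
        bytes_per_row=1000,
        elapsed_s=0.001,
        peak_gb_per_s=1.0,
    )
    print(report)
```

Only the rows destined for other ranks are counted, since the local portion never touches NVLink. Whether to report unidirectional or bidirectional bandwidth is a convention to agree on before comparing against a spec number.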

CC: @kwen2501

syed-ahmed, Oct 17 '25 21:10

@syed-ahmed FYI, I had a draft PR for the integration (https://github.com/pytorch/torchtitan/pull/1569). I hit some issues and haven't had time to revisit it since.

tianyu-l, Oct 17 '25 21:10

Thanks @tianyu-l! I'll try to take a look at your PR.

syed-ahmed, Oct 21 '25 17:10