Gu Wei
Results: 2 issues of Gu Wei
Can I use ByteTransformer to train Transformer models on GPUs? Which models does it currently support?
**Your question** Why doesn't Megatron-LM use the tensor parallel APIs of PyTorch? https://pytorch.org/docs/stable/distributed.tensor.parallel.html
stale