TPU Pod Training

Open opooladz opened this issue 6 months ago • 0 comments

Hello,

I'm trying to pre-train a Llama model using Fabric on a TPU Pod. I have access to a few v4-32s. Training on a v4-8 is trivial using PyTorch/XLA, but scaling to a pod is giving me issues.

LitGPT seems like the most promising PyTorch/XLA-based framework to go with. Can you guys help me with this?

Thanks, Omead

opooladz, Jul 30 '24 16:07