EasyLM
For the 30B LLaMA model, can the server run on a TPU v3-8 (128 GB) by configuring mesh_dims? I tried 8,1 and 4,1, but neither seems to work.
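For context, a mesh_dims setting like 8,1 or 4,1 describes how the eight TPU v3-8 cores are reshaped into a device grid whose axes are used for data and model parallelism. A minimal JAX sketch of the idea (the axis names and the shapes here are illustrative, not EasyLM's actual configuration):

```python
import numpy as np
import jax
from jax.sharding import Mesh, NamedSharding, PartitionSpec

# mesh_dims "8,1" means: arrange the devices in an (8, 1) grid.
# On a TPU v3-8 this would be (8, 1); on other hosts we fall back
# to however many devices are available so the sketch still runs.
mesh_dims = (len(jax.devices()), 1)
devices = np.array(jax.devices()).reshape(mesh_dims)
mesh = Mesh(devices, axis_names=("dp", "mp"))  # data / model axes (illustrative names)

# Shard an array's first dimension across the "dp" axis and its
# second across the "mp" axis of the mesh.
x = jax.numpy.zeros((8, 16))
sharded = jax.device_put(x, NamedSharding(mesh, PartitionSpec("dp", "mp")))
```

Note that sharding only splits activations and weights across cores; it does not reduce the total memory the model needs, which is why a model that is too large overall still fails regardless of the mesh shape chosen.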
Any luck training the 30B on a single TPU v3-8 so far? Does it even fit? The 7B needs 84GB of VRAM, so I would expect the 30B to need at least 4 times that.
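Back-of-the-envelope math supports that doubt: the quoted 84 GB for 7B works out to 12 bytes per parameter, and scaling that ratio to 30B lands far beyond a single v3-8's HBM. A rough sketch, assuming the same bytes-per-parameter ratio holds:

```python
def training_memory_gb(params_billion, bytes_per_param=12.0):
    """Rough training-memory estimate in GB.

    bytes_per_param=12 is inferred from the 84 GB figure quoted for
    the 7B model (84e9 bytes / 7e9 params = 12 bytes per parameter).
    """
    return params_billion * bytes_per_param

TPU_V3_8_HBM_GB = 128  # 8 cores x 16 GB HBM each

print(training_memory_gb(7))   # 84.0 GB -> fits within 128 GB
print(training_memory_gb(30))  # 360.0 GB -> nearly 3x a v3-8's HBM
```

So under this estimate the 30B model would need roughly 360 GB, i.e. at least a v3-32 slice (or aggressive offloading/quantization) rather than a single v3-8.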
No luck so far; I still haven't gotten it to work.