EasyLM
For the 30B LLaMA model, can the server run on a TPU v3-8 (128 GB) by configuring mesh_dims? I tried 8,1 and 4,1, but neither seems to work.
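For context, a mesh_dims setting like 8,1 or 4,1 describes how the eight TPU v3-8 cores are reshaped into a device grid whose axes are used for data and model parallelism. A minimal JAX sketch of the idea (the axis names and the shapes here are illustrative, not EasyLM's actual configuration):

```python
import numpy as np
import jax
from jax.sharding import Mesh, NamedSharding, PartitionSpec

# mesh_dims "8,1" means: arrange the devices in an (8, 1) grid.
# On a TPU v3-8 this would be (8, 1); on other hosts we fall back
# to however many devices are available so the sketch still runs.
mesh_dims = (len(jax.devices()), 1)
devices = np.array(jax.devices()).reshape(mesh_dims)
mesh = Mesh(devices, axis_names=("dp", "mp"))  # data / model axes (illustrative names)

# Shard an array's first dimension across the "dp" axis and its
# second across the "mp" axis of the mesh.
x = jax.numpy.zeros((8, 16))
sharded = jax.device_put(x, NamedSharding(mesh, PartitionSpec("dp", "mp")))
```

Note that sharding only splits activations and weights across cores; it does not reduce the total memory the model needs, which is why a model that is too large overall still fails regardless of the mesh shape chosen.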
Any luck training the 30B on a single TPU v3-8 so far? Does it even fit? The 7B needs 84GB of VRAM, so I would expect the 30B to need at least 4 times that.
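Back-of-the-envelope math supports that doubt: the quoted 84 GB for 7B works out to 12 bytes per parameter, and scaling that ratio to 30B lands far beyond a single v3-8's HBM. A rough sketch, assuming the same bytes-per-parameter ratio holds:

```python
def training_memory_gb(params_billion, bytes_per_param=12.0):
    """Rough training-memory estimate in GB.

    bytes_per_param=12 is inferred from the 84 GB figure quoted for
    the 7B model (84e9 bytes / 7e9 params = 12 bytes per parameter).
    """
    return params_billion * bytes_per_param

TPU_V3_8_HBM_GB = 128  # 8 cores x 16 GB HBM each

print(training_memory_gb(7))   # 84.0 GB -> fits within 128 GB
print(training_memory_gb(30))  # 360.0 GB -> nearly 3x a v3-8's HBM
```

So under this estimate the 30B model would need roughly 360 GB, i.e. at least a v3-32 slice (or aggressive offloading/quantization) rather than a single v3-8.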
No luck so far; I still haven't gotten it to work.