grok-1 OOM with A100 8*80G

How can I run the demo case with random data? I am using 8 × A100 80G GPUs and still get an OOM error. I think it is because I start the case with fp16 or fp32. How do I use QW8Bit with random data? Thanks~

Mar 18 '24 11:03 nanhexinyu

When I change float32 to int8, I hit another problem. With this change:

w = hk.get_parameter("w", [input_size, output_size], jnp.int8, init=hk.initializers.Constant(0))

I get:

raise TypeError(f"{name} argument does not appear valid. It should be a "
TypeError: params argument does not appear valid. It should be a mapping but is of type <class 'model.TrainingState'>. For reference the parameters for apply are `apply(params, rng, ...)` for `hk.transform` and `apply(params, state, rng, ...)` for `hk.transform_with_state`.

Mar 18 '24 12:03 nanhexinyu

Silly me, thinking that I could run Grok on my two 3090TIs :)

Mar 18 '24 13:03 jesst3r

Silly me, thinking that I could run Grok on my two 3090TIs :)

Clearly, the memory of those graphics cards is still far from sufficient; the model is too large!

Mar 18 '24 17:03 null0034

It takes 65 GB of GPU memory on each A100 80G.
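For a rough sense of scale, here is a back-of-the-envelope sketch of how much memory the weights alone would need at different precisions, checked against the GPU setups mentioned in this thread. It assumes Grok-1's published size of 314B parameters and counts only the weights; activations, caches and framework overhead are left out, so real usage will be higher. The helper names (`weight_gb`, `fits`) are made up for illustration and are not part of the grok-1 repo.

```python
# Weights-only memory estimate. Assumption: 314e9 parameters (Grok-1's
# published size). Activations, caches and overhead are NOT included.
GROK1_PARAMS = 314e9

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "int8": 1}

# GPU setups mentioned in the thread: (number of GPUs, GB per GPU).
SETUPS = {
    "8 x A100 80G": (8, 80),
    "4 x A100 40G": (4, 40),
    "2 x 3090 Ti 24G": (2, 24),
}


def weight_gb(dtype: str, n_params: float = GROK1_PARAMS) -> float:
    """Total memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


def fits(dtype: str, n_gpus: int, gb_per_gpu: float) -> bool:
    """True if the weights alone fit in the combined GPU memory."""
    return weight_gb(dtype) <= n_gpus * gb_per_gpu


if __name__ == "__main__":
    for name, (n_gpus, gb_per_gpu) in SETUPS.items():
        for dtype in BYTES_PER_PARAM:
            per_gpu = weight_gb(dtype) / n_gpus
            verdict = "fits" if fits(dtype, n_gpus, gb_per_gpu) else "does not fit"
            print(f"{name:16s} {dtype:9s} {per_gpu:7.2f} GB/GPU (weights only): {verdict}")
```

Under these assumptions float32 weights exceed the 640 GB available on 8 × A100 80G, 16-bit weights leave almost no headroom, and only 8-bit weights leave room for anything else. The 65 GB per GPU reported above is higher than the weights-only int8 share, which is expected since the estimate ignores everything except the weights.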

Mar 19 '24 04:03 zRzRzRzRzRzRzR

H100 SXM5 NVLink GPU × 8 $34,000.00 each ($272,000.00)

AMD 100-000000802 EPYC 9124 Genoa 9004 Series 16-core 3 GHz Server Processor × 2 $1,111.00 each ($2,222.00)

24 x 64GB DDR5 4800 ECC Reg Server Compatible Memory Kit (1.5TB Total) $8,280.00

Micron MTFDKCB960TFR-1BC1ZABYYR 7450 PRO 960 GB Solid State Drive - 2.5" Internal - U.3 (PCI Express NVMe 4.0 x4) - Read Intensive - TAA Compliant $142.00 each

total $297,019.00 (without station/power units)

Mar 19 '24 16:03 atgsmsg

I can confirm that 512 GB of RAM and 4 × A100 40 GB are not enough for it.

Mar 19 '24 16:03 surak

Silly me, thinking that I could run Grok on my two 3090TIs :)

you're so funny!

Mar 22 '24 11:03 xuyixun21
