Add missing `--ffn-expansion-factor` to FLOPs calculator script

Open haileyschoelkopf opened this issue 1 year ago • 1 comments

As per the title. This arg is in the other two scripts but was missing from `calc_transformer_flops.py`.

haileyschoelkopf • Mar 25 '24 23:03

Note that now, to get Llama FLOP numbers, one should pass an FFN expansion factor of 8/3 together with `--swiglu`.

IMO it'd be preferable for the default to be that each of the 3 SwiGLU FFN weights is 2/3 times the FFN expansion factor in size, instead of multiplying FLOPs by 3/2. That way, Llama-esque settings (or any GLU model where the FFN expansion factor is 8/3) are the default when using `--swiglu`.
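To make the two conventions concrete, here is a minimal sketch that counts FFN weights per layer under each one. The function names and the `hidden` parameter are hypothetical and are not taken from `calc_transformer_flops.py`. It assumes a plain FFN has two `hidden x (factor * hidden)` matrices, that a SwiGLU FFN has three, and that matmul FLOPs scale linearly with the weight count. Any rounding of the intermediate size is ignored.

```python
from fractions import Fraction


def ffn_weights_scaled_flops(hidden, ffn_expansion_factor, swiglu=False):
    """Convention described as current: keep the expansion factor as given
    and multiply the two-matrix FFN cost by 3/2 when SwiGLU is enabled."""
    base = 2 * ffn_expansion_factor * hidden**2
    return base * Fraction(3, 2) if swiglu else base


def ffn_weights_scaled_width(hidden, ffn_expansion_factor, swiglu=False):
    """Convention suggested above: with SwiGLU, each of the three matrices
    has an inner width of (2/3) * ffn_expansion_factor * hidden."""
    if swiglu:
        inner = Fraction(2, 3) * ffn_expansion_factor * hidden
        return 3 * hidden * inner
    return 2 * ffn_expansion_factor * hidden**2


hidden = 4096  # illustrative hidden size, an assumption for this example

# Scaled-FLOPs convention: Llama-style sizing needs an explicit factor of 8/3.
scaled_flops = ffn_weights_scaled_flops(hidden, Fraction(8, 3), swiglu=True)

# Scaled-width convention: a factor of 4 already yields the same sizing.
scaled_width = ffn_weights_scaled_width(hidden, 4, swiglu=True)

print(scaled_flops == scaled_width)
```

Under these assumptions both calls give `8 * hidden**2` weights, so the two conventions differ only in which expansion factor the user has to pass.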

haileyschoelkopf • Mar 25 '24 23:03

Reviewed and merging. @haileyschoelkopf -- I'd love it if you made a follow-up PR with your suggested Llama changes.

Quentin-Anthony • Aug 11 '24 12:08