Add missing `--ffn-expansion-factor` to FLOPs calculator script
As per the title. This arg is in the other two scripts but was missing from `calc_transformer_flops.py`.
Note that, to get Llama FLOP numbers, one should now pass 8/3 as the FFN expansion factor together with `--swiglu`.
IMO it'd be preferable for the default to be that each of the 3 SwiGLU FFN weights is 2/3 times the FFN expansion factor in size, instead of multiplying FLOPs by 3/2. That way, Llama-esque settings (or any GLU model where the FFN expansion factor is 8/3) would be the default when using `--swiglu`.
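For illustration, here is a minimal sketch of the arithmetic behind this suggestion. The function and variable names are hypothetical and not taken from `calc_transformer_flops.py`; it assumes the default expansion factor is 4, counts only the forward-pass FFN matmul FLOPs (2 per weight per token), and ignores any rounding of the intermediate size.

```python
from fractions import Fraction


def ffn_matmul_flops_per_token(hidden_size, expansion_factor, num_matrices):
    """Forward-pass FFN matmul FLOPs per token (hypothetical helper).

    Each FFN weight matrix has hidden_size * (expansion_factor * hidden_size)
    entries, and a matmul costs 2 FLOPs per weight per token.
    """
    params = num_matrices * Fraction(expansion_factor) * hidden_size ** 2
    return 2 * params


h = 4096  # example hidden size, chosen only for illustration

# Non-gated FFN: 2 weight matrices, expansion factor 4.
baseline = ffn_matmul_flops_per_token(h, 4, 2)

# SwiGLU with 3 full-size matrices at expansion factor 4:
# this is the "multiply FLOPs by 3/2" behaviour.
swiglu_times_3_2 = ffn_matmul_flops_per_token(h, 4, 3)

# SwiGLU with the expansion factor set to 8/3 (Llama-style).
swiglu_8_3 = ffn_matmul_flops_per_token(h, Fraction(8, 3), 3)

# Suggested default: each of the 3 matrices is 2/3 of the expansion factor.
swiglu_proposed = ffn_matmul_flops_per_token(h, Fraction(2, 3) * 4, 3)

print("baseline:          ", baseline)
print("swiglu x 3/2:      ", swiglu_times_3_2)
print("swiglu, factor 8/3:", swiglu_8_3)
print("swiglu, proposed:  ", swiglu_proposed)
```

Under these assumptions, the suggested default with an expansion factor of 4 gives the same FFN FLOPs as passing 8/3 explicitly, and both match the non-gated baseline, whereas the 3/2 multiplier inflates the count by 50%.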
Reviewed and merging. @haileyschoelkopf, I'd love it if you made a follow-up PR with your suggested Llama changes.