llm_gpu_cal Why 13.3?

Open adamlin120 opened this issue 1 year ago • 0 comments

Thank you for the cool project!

Could you please elaborate on how you came up with 13.3?

My understanding is that:

Number of GPUs needed = Training FLOPs / (Theoretical FLOPS * MFU) = 6 * Model Size * # Tokens / (Theoretical FLOPS * MFU)?
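
To make the formula above concrete, here is a minimal Python sketch of that understanding. It is not taken from app.py; the function names, the 7B-parameter / 1T-token / 30-day example, and the 50% MFU are all hypothetical, and 312 TFLOPS is used as the dense BF16 peak of an A100. Note that Training FLOPs / (Theoretical FLOPS * MFU) has units of GPU-seconds, so the sketch also divides by an assumed wall-clock training time to get a GPU count; that extra step is an assumption, not something confirmed by the project.

```python
def training_flops(model_params: float, num_tokens: float) -> float:
    """Approximate training compute using the 6 * N * D rule of thumb."""
    return 6.0 * model_params * num_tokens


def gpu_seconds(model_params: float, num_tokens: float,
                peak_flops_per_gpu: float, mfu: float) -> float:
    """Training FLOPs / (Theoretical FLOPS * MFU), i.e. total GPU-seconds."""
    effective_flops_per_gpu = peak_flops_per_gpu * mfu
    return training_flops(model_params, num_tokens) / effective_flops_per_gpu


def gpus_needed(model_params: float, num_tokens: float,
                peak_flops_per_gpu: float, mfu: float,
                training_days: float) -> float:
    """GPU count for a target wall-clock time (assumed step, see lead-in)."""
    wall_clock_seconds = training_days * 24 * 3600
    total = gpu_seconds(model_params, num_tokens, peak_flops_per_gpu, mfu)
    return total / wall_clock_seconds


if __name__ == "__main__":
    # Hypothetical example values, not from the project.
    n_gpus = gpus_needed(
        model_params=7e9,           # 7B parameters
        num_tokens=1e12,            # 1T training tokens
        peak_flops_per_gpu=312e12,  # A100 dense BF16 peak, FLOP/s
        mfu=0.5,                    # assumed model FLOPs utilization
        training_days=30,           # assumed training budget
    )
    print(f"Estimated GPUs needed: {n_gpus:.1f}")
```

Under these assumptions the estimate comes out at roughly a hundred GPUs; changing the MFU or the training budget scales the result inversely.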

https://github.com/hunkim/llm_gpu_cal/blob/70484daf659ac0d3175adcb9f4c16cfc183071c6/app.py#L6C54-L6C54

Jan 18 '24 02:01 adamlin120