Inference Speed Benchmark

Open RaiAmanRai opened this issue 2 years ago • 0 comments

❓ Question

Hi, I am looking for a metric to compare the inference speed of the 7B, 13B, and 70B models. More precisely, I am looking for something like "X tokens/sec" on "M x N Type GPUs".
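To make the requested metric concrete, here is a minimal sketch of how a tokens/sec figure could be measured. It is only an illustration: `tokens_per_second` and `dummy_generate` are hypothetical names, not part of llm-foundry, and the dummy function stands in for a real model call.

```python
import time
from typing import Callable, List


def tokens_per_second(
    generate: Callable[[], List[int]], warmup: int = 1, runs: int = 3
) -> float:
    """Average generated tokens per second over `runs` timed calls."""
    for _ in range(warmup):
        generate()  # untimed warm-up call, excluded from the measurement
    total_tokens = 0
    total_seconds = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        tokens = generate()
        total_seconds += time.perf_counter() - start
        total_tokens += len(tokens)
    return total_tokens / total_seconds


def dummy_generate() -> List[int]:
    # Hypothetical stand-in for a real model call: sleeps to simulate latency,
    # then returns 32 fake token ids.
    time.sleep(0.01)
    return list(range(32))


if __name__ == "__main__":
    print(f"{tokens_per_second(dummy_generate):.1f} tokens/sec")
```

On a real GPU the timed region would presumably also need device synchronization before reading the clock, and the result would depend on batch size, prompt length, and precision, so those should be reported alongside the "M x N Type GPUs" setup.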

RaiAmanRai, Jun 27 '23 05:06