
Could you evaluate the performance of Llama 3.1 across different quantized versions and parameter sizes?

Open iwaitu opened this issue 1 year ago • 0 comments

405B FP8 vs 405B INT4 vs 70B vs 70B FP8 vs 70B INT4

I think many people are interested in knowing what the most cost-effective deployment solution is.
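
As a rough starting point for the cost question, here is a back-of-the-envelope sketch of the weight memory each variant would need. It only multiplies the nominal parameter count by the bits per weight. It assumes the plain "70B" entry means 16-bit (BF16) weights, which is my assumption, and it ignores KV cache, activations, and runtime overhead, so real deployments will need more. The names `VARIANTS`, `weight_memory_gb`, and `estimate_all` are made up for illustration.

```python
# Nominal parameter counts (in billions) and assumed bits per weight.
# The unquantized 70B entry is ASSUMED to be 16-bit (BF16).
VARIANTS = {
    "405B FP8": (405, 8),
    "405B INT4": (405, 4),
    "70B (assumed BF16)": (70, 16),
    "70B FP8": (70, 8),
    "70B INT4": (70, 4),
}


def weight_memory_gb(params_billion, bits_per_weight):
    """Approximate weight storage in GB (1 GB = 1e9 bytes).

    Covers weights only: no KV cache, activations, or framework overhead.
    """
    return params_billion * bits_per_weight / 8


def estimate_all(variants=VARIANTS):
    """Return {variant name: approximate weight memory in GB}."""
    return {name: weight_memory_gb(p, b) for name, (p, b) in variants.items()}


if __name__ == "__main__":
    for name, gb in estimate_all().items():
        print(f"{name:<20} ~{gb:.1f} GB of weights")
```

This says nothing about quality or throughput, which is what the evaluation would need to measure. It only gives a lower bound on how much accelerator memory each option requires.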

iwaitu Jul 28 '24 00:07