
[Feature Request] Why can't we use Q8 quants?

Open · alexcardo opened this issue 1 year ago · 1 comment

What is your request?

I found that the only way to run a quantized model is with q4 or q6 quants. Why not add q8 quants as well? This seems like a strange omission. Is there a chance it could be enabled?

What is your motivation for this change?

As a rule, q8 is the best quant option when you don't want the model to lose quality.
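
To illustrate the point, here is a rough sketch of how round-trip error shrinks as the bit width grows. It assumes a simplified per-tensor symmetric (absmax) scheme on synthetic Gaussian weights, which is not the block format any particular runtime uses for q4, q6, or q8. The names `quantize_roundtrip` and `rms_error` are made up for this example.

```python
import random

def quantize_roundtrip(values, bits):
    """Symmetric absmax quantization to `bits` bits, then dequantize."""
    qmax = 2 ** (bits - 1) - 1            # 7 for 4-bit, 31 for 6-bit, 127 for 8-bit
    absmax = max(abs(v) for v in values)
    scale = absmax / qmax if absmax else 1.0
    # Round each value to the nearest integer level, then map back to floats.
    return [round(v / scale) * scale for v in values]

def rms_error(values, bits):
    """Root-mean-square difference between the original and restored values."""
    restored = quantize_roundtrip(values, bits)
    squared = sum((a - b) ** 2 for a, b in zip(values, restored))
    return (squared / len(values)) ** 0.5

random.seed(0)
# Synthetic stand-in for model weights (an assumption, not real model data).
weights = [random.gauss(0.0, 1.0) for _ in range(4096)]

errors = {bits: rms_error(weights, bits) for bits in (4, 6, 8)}
for bits, err in errors.items():
    print(f"{bits}-bit RMS round-trip error: {err:.5f}")
```

With this toy scheme the error drops at each step from 4 to 6 to 8 bits, which is the intuition behind asking for q8. Real quantization formats use per-block scales and other refinements, so the actual gaps will differ.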

Any other details?

No response

alexcardo · Jun 07 '24 18:06