mistral.rs
Feature request: Exllamav2 (exl2) backend
It would be fantastic if mistral.rs implemented an exllamav2 backend to allow loading exl2 models.
I know you're planning this, but I saw there wasn't an open feature request to track it, so I thought I'd raise one.
Adding context from a Reddit comment here:
Do you have any plans to integrate exllamav2 to mistral.rs?
Yes! I am currently in the process of creating the necessary infrastructure for adding arbitrary quantization methods, starting with GPTQ integration. Once that is done, the infrastructure will be flexible enough that we can add other quants like exllamav2 or HQQ to mistral.rs rapidly.
See also:
- https://github.com/EricLBuehler/mistral.rs/issues/418