aphrodite-engine
[Feature]: add back exl2 support?
🚀 The feature, motivation and pitch
Why was exl2 support dropped?
Is there anything that the community can help with that is stuck?
Alternatives
No response
Additional context
No response
+1, happy to help in any way I can
The primary culprit was the upstream PR vllm-project/vllm#3977, which drastically changed how quantized layers are handled and made working with exllamav2 extremely difficult. If someone can get the existing exl2 quantization working with the changes from that PR, exl2 support should be easier to manage.