blog
Update mixtral.md
Use ExLlama kernels via GPTQConfig for faster inference under production load. @davanstrien @younesbelkada @pcuenca
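For reviewers, a minimal sketch of what enabling the ExLlama kernels through `GPTQConfig` could look like. The helper names `exllama_gptq_kwargs` and `load_gptq_model` are hypothetical and not taken from mixtral.md. The sketch assumes a recent `transformers` release in which `GPTQConfig` accepts `use_exllama` and `exllama_config`, a CUDA GPU, the GPTQ backend packages (`optimum` and `auto-gptq`) installed, and a checkpoint that is already GPTQ-quantized to 4 bits.

```python
def exllama_gptq_kwargs(exllama_version=1):
    """Build keyword arguments for transformers.GPTQConfig with ExLlama enabled.

    Hypothetical helper for illustration. bits is fixed at 4 because, as far
    as I know, the ExLlama kernels only cover 4-bit GPTQ weights.
    """
    if exllama_version not in (1, 2):
        raise ValueError("exllama_version must be 1 or 2")
    return {
        "bits": 4,
        "use_exllama": True,
        "exllama_config": {"version": exllama_version},
    }


def load_gptq_model(model_id, exllama_version=1):
    """Load an already-quantized GPTQ checkpoint with ExLlama kernels.

    model_id is supplied by the caller; no specific checkpoint is assumed.
    """
    # Imported lazily so the kwargs helper can be used without transformers.
    from transformers import AutoModelForCausalLM, GPTQConfig

    quantization_config = GPTQConfig(**exllama_gptq_kwargs(exllama_version))
    return AutoModelForCausalLM.from_pretrained(
        model_id,
        device_map="auto",  # assumes accelerate is installed
        quantization_config=quantization_config,
    )
```

Calling `load_gptq_model("<your-gptq-checkpoint>", exllama_version=2)` would request the second-generation kernels, which may need newer `optimum` and `auto-gptq` versions than the first.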