
[Feature] Support QuaRot quantization scheme

Open · serser opened this issue 10 months ago • 1 comment

Motivation

QuaRot (https://arxiv.org/abs/2404.00456) has been out for three weeks, and the preliminary results are convincing. See also the discussion with the QuaRot authors in the llama.cpp issue tracker. It would be amazing to have it supported in LMDeploy by default.

Best.

Related resources

https://github.com/ggerganov/llama.cpp/issues/6444
https://arxiv.org/abs/2404.00456

Additional context

No response

serser · Apr 24 '24 12:04

@pppppM @AllentDan @lzhangzz may want to investigate the QuaRot quantization algorithm; it looks very promising.

lvhan028 · Apr 26 '24 04:04