
Support Quantized Model

Open • SeanHH86 opened this issue 11 months ago • 1 comment

Please add support for quantized models, for example:
https://huggingface.co/THUDM/chatglm2-6b-int4
https://huggingface.co/Qwen/Qwen1.5-72B-Chat-GPTQ-Int4
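
For context, here is a rough sketch of what 4-bit weight quantization does. It assumes plain symmetric round-to-nearest quantization with a single scale, which is NOT the GPTQ algorithm used by the checkpoints linked above; the function names `quantize_int4` and `dequantize_int4` are hypothetical and only for illustration.

```python
from typing import List, Tuple


def quantize_int4(weights: List[float]) -> Tuple[List[int], float]:
    """Map floats to integers in the int4 range [-8, 7] with one shared scale.

    Illustrative assumption: symmetric round-to-nearest, not GPTQ.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Largest magnitude maps to 7; fall back to 1.0 for all-zero input.
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale


def dequantize_int4(q: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from the 4-bit integers."""
    return [v * scale for v in q]


weights = [0.7, -0.3, 0.1, 0.0]
q, scale = quantize_int4(weights)
restored = dequantize_int4(q, scale)
```

Each weight is stored in 4 bits instead of 16 or 32, at the cost of a rounding error of at most half a scale step per weight.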

SeanHH86 • Mar 13 '24 06:03

Inference speed is slow.

SeanHH86 • Apr 16 '24 01:04