hardfish82
+1. The full error output is as follows: WARNING 06-07 15:49:56 config.py:208] gptq quantization is not fully optimized yet. The speed can be slower than non-quantized models. 2024-06-07 15:49:58,873 INFO worker.py:1724 -- Started a local Ray instance....
Tested with transformers==4.41.0, vllm==0.4.0.post1, torch==2.1.2: Qwen2-72B-Instruct-AWQ loads successfully, but Qwen2-57B-A14B-Instruct-GPTQ-Int4 still fails to load.
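For anyone trying to reproduce this, here is a minimal sketch of loading both models through vLLM's offline `LLM` API. The `Qwen/` Hugging Face repo IDs, the `tensor_parallel_size` value, and the explicit `quantization` flags are my assumptions, not taken from the comment above; adjust them to your setup.

```python
from vllm import LLM, SamplingParams

# Reported to work with transformers==4.41.0, vllm==0.4.0.post1, torch==2.1.2.
# tensor_parallel_size=4 is an assumption; set it to your actual GPU count.
llm_awq = LLM(
    model="Qwen/Qwen2-72B-Instruct-AWQ",
    quantization="awq",
    tensor_parallel_size=4,
)

# Reported to fail with the same versions: the GPTQ-Int4 MoE model only
# emits the "gptq quantization is not fully optimized yet" warning and
# then errors out during initialization.
llm_gptq = LLM(
    model="Qwen/Qwen2-57B-A14B-Instruct-GPTQ-Int4",
    quantization="gptq",
    tensor_parallel_size=4,
)

# Quick smoke test on the model that does load.
outputs = llm_awq.generate(["Hello"], SamplingParams(max_tokens=32))
print(outputs[0].outputs[0].text)
```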
Agreed, same here.