eigenLiu comments

Results: 56 comments of eigenLiu

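Can Qwen1.5-72B-Chat-GPTQ-Int4 run on 4×16 GB V100s?

I have a way to do it. Please follow https://github.com/vllm-project/vllm/issues/4369

[Feature] Do we support inference for GPTQ-quantized models?

A technical question for you, @zhyncs. Take this model, for example: https://huggingface.co/Phind/Phind-CodeLlama-34B-v2. I only have four V100 cards with 64 GB of VRAM in total, and I want to use int4 quantization. Does lmdeploy have a solution for this?

[Feature] Do we support inference for GPTQ-quantized models?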
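As a rough sanity check on the question above, the weight footprint of an int4-quantized model can be estimated from the parameter count alone. The sketch below is a back-of-the-envelope calculation, not a measurement: the function name `quantized_weight_gb` is hypothetical, the 34e9 parameter count is taken from the model name, and the estimate ignores quantization scales and zero points, the KV cache, and activations, all of which add to real usage.

```python
def quantized_weight_gb(num_params: float, bits_per_weight: float = 4.0) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes).

    Counts only the packed weights. Scales, zero points, KV cache and
    activations are NOT included, so real usage is higher.
    """
    return num_params * bits_per_weight / 8 / 1e9


# Assumed figures: ~34e9 parameters, four 16 GB V100 cards.
weights_gb = quantized_weight_gb(34e9)
total_vram_gb = 4 * 16
print(f"int4 weights: ~{weights_gb:.1f} GB of {total_vram_gb} GB total VRAM")
```

Under these assumptions the packed weights take about a quarter of the combined memory, so whether the model actually runs depends on the remaining overheads and on whether the inference engine supports the quantization format on V100.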

> #2090 adds support for both AWQ and GPTQ models on V100.

Many thanks for this PR!

I saw that this PR was merged: https://github.com/InternLM/lmdeploy/pull/2090. So I'll try this GPTQ model on V100: https://huggingface.co/TheBloke/Phind-CodeLlama-34B-v2-GPTQ. If it succeeds, I'll report back here and close this issue. Thanks to you all...

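[Feature] Do we support inference for GPTQ-quantized models?

@zhyncs hi~~ It fails with an error on launch. Version 060a is not on PyPI, so I could not pip install it and installed from source instead. What I ran: unzipped the 060a zip package, entered the lmd directory, and then: `mkdir -p build && cd build` `bash ../generate.sh make` `make -j$(nproc) && make install` `cd ..` `pip install -e .` Then, guided by the various errors, I changed these settings in the model config: ![微信图片_20240830223657](https://github.com/user-attachments/assets/bd52c14d-2ed4-4edc-936f-8fb44881fb5c) bf16 changed to fp16, the quantization config's group_size changed from -1 to 128, desc_act...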
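The two config edits named above (bf16 to fp16, group_size from -1 to 128) can be sketched as a small script. This is only an illustration of those two edits under assumptions: the key names `torch_dtype`, `quantization_config` and `group_size` follow the common Hugging Face `config.json` layout and may differ in the actual file, and `patch_gptq_config` is a hypothetical helper. The truncated `desc_act` change is left untouched because the comment does not say what it was set to.

```python
import copy


def patch_gptq_config(config: dict) -> dict:
    """Return a copy of a model config with the two edits from the comment.

    Assumed key names (common Hugging Face layout, not confirmed by the
    comment): "torch_dtype", "quantization_config", "group_size".
    """
    patched = copy.deepcopy(config)

    # Edit 1: bf16 -> fp16 (V100 has no native bfloat16 support).
    if patched.get("torch_dtype") == "bfloat16":
        patched["torch_dtype"] = "float16"

    # Edit 2: group_size -1 -> 128 in the quantization section, if present.
    quant = patched.get("quantization_config")
    if isinstance(quant, dict) and quant.get("group_size") == -1:
        quant["group_size"] = 128

    return patched


# Hypothetical input resembling the config described in the comment.
example = {
    "torch_dtype": "bfloat16",
    "quantization_config": {"bits": 4, "group_size": -1},
}
print(patch_gptq_config(example))
```

Note that editing `group_size` in the config does not re-quantize the stored weights. If the checkpoint was quantized with a different grouping, the loader may accept the file and still produce errors or incorrect output.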


[Feature] Quantized inference on V100