xiangqi1997 comments


Results: 3 comments by xiangqi1997

[Feature] quantization of internvl-chat-v1.5

@irexyc Hi. An 8-bit version of internvl-v1.5 has been released at https://huggingface.co/OpenGVLab/InternVL-Chat-V1-5-Int8/discussions . I am trying to load it on two 24G GPUs, but it fails with OOM. Which parameters can I adjust to fix that?...

[Feature] quantization of internvl-chat-v1.5

I also tried four 3090 cards. GPU 0 has 12G free and the others are idle, but the 8-bit model still cannot be loaded (OOM). The command is: `lmdeploy serve api_server ~/.cache/huggingface/hub/models--OpenGVLab--InternVL-Chat-V1-5-Int8/snapshots/872c99216b9dd5f69ea610e160dcc8692f1ab214/ --backend turbomind --server-port 1234 --tp 4 --cache-max-entry-count 0.01`
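
For a rough sanity check of setups like the one above, here is a back-of-the-envelope sketch of weight memory per GPU under tensor parallelism. It is arithmetic only, not a diagnosis of this OOM: the parameter count `ASSUMED_PARAMS` is an assumption (it is not stated in this thread), the helper `weight_gib_per_gpu` is a hypothetical name, and the sketch assumes weights shard evenly across GPUs. It does not tell you whether the backend actually keeps the weights in 8-bit.

```python
def weight_gib_per_gpu(num_params: float, bytes_per_param: float, tp: int) -> float:
    """Weight memory per GPU in GiB, assuming an even split across `tp` GPUs."""
    return num_params * bytes_per_param / tp / 2**30


# Assumption: approximate parameter count, not taken from this thread.
ASSUMED_PARAMS = 26e9

for tp in (2, 4):
    # 1 byte per parameter for int8 weights, 2 bytes for fp16 weights.
    for label, nbytes in (("int8", 1), ("fp16", 2)):
        gib = weight_gib_per_gpu(ASSUMED_PARAMS, nbytes, tp)
        print(f"tp={tp} {label}: {gib:.1f} GiB of weights per GPU")
```

The estimate covers weights only. KV cache, activations, and runtime overhead come on top, so a per-GPU figure close to the free memory on the fullest card (12G on GPU 0 here) leaves little headroom.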

[Feature] quantization of internvl-chat-v1.5

Thanks for the reply.