xiangqi1997
@irexyc Hi. An 8-bit version of InternVL-v1.5 has been released at https://huggingface.co/OpenGVLab/InternVL-Chat-V1-5-Int8/discussions . I am trying to load it on two 24 GB GPUs, but it fails with OOM. Which parameter can I adjust to fix that?...
I also tried four 3090 cards. GPU 0 has 12 GB free and the other three are idle, but the 8-bit model still fails to load (OOM). The command is: `lmdeploy serve api_server ~/.cache/huggingface/hub/models--OpenGVLab--InternVL-Chat-V1-5-Int8/snapshots/872c99216b9dd5f69ea610e160dcc8692f1ab214/ --backend turbomind --server-port 1234 --tp 4 --cache-max-entry-count 0.01`
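
As a rough sanity check on whether the weights alone should fit, here is a back-of-the-envelope sketch. It rests on assumptions the thread does not state: a parameter count of about 25.5B (hypothetical here, check the model card), weights split evenly across the `--tp` GPUs, and no accounting for the KV cache, activations, or the vision encoder being placed on a single card. The helper `weight_gib_per_gpu` is illustrative and not part of lmdeploy.

```python
def weight_gib_per_gpu(num_params: float, bytes_per_param: float, tp: int) -> float:
    """Rough per-GPU weight footprint in GiB, assuming even sharding over tp GPUs."""
    if tp < 1:
        raise ValueError("tp must be >= 1")
    return num_params * bytes_per_param / tp / 2**30


# Assumption: approximate parameter count, not taken from the thread.
ASSUMED_PARAMS = 25.5e9

if __name__ == "__main__":
    for tp in (2, 4):
        # 1 byte per parameter for int8, 2 bytes for fp16.
        for label, nbytes in (("int8", 1), ("fp16", 2)):
            gib = weight_gib_per_gpu(ASSUMED_PARAMS, nbytes, tp)
            print(f"tp={tp} {label}: ~{gib:.1f} GiB of weights per GPU")
```

Under these assumptions, int8 weights sharded four ways come out well below the 12 GB free on GPU 0, so an OOM at load time would suggest that the weights are not actually held in 8 bits, or are not evenly sharded, during loading. The thread itself does not confirm either cause.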