Qwen2.5 7B uses too much GPU memory

Open mengxianglong123 opened this issue 1 year ago • 3 comments

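Why does launching the Qwen2.5 7B Instruct model with xinference (started directly from the web page using the qwen2.5 instruct option) immediately take up more than 40 GB of GPU memory? The GPU is an A6000 with 48 GB. Is this caused by the 32k context length? The selected engine is vLLM.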

mengxianglong123 avatar Sep 25 '24 09:09 mengxianglong123

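gpu_memory_utilization defaults to 0.9, so vLLM reserves roughly 90% of GPU memory up front. You can adjust the parameter yourself.

For example: max_model_len: 32768, gpu_memory_utilization: 0.8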
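
As a rough illustration of why the card shows 40+ GB in use, the sketch below multiplies the card's memory by the utilization fraction. The helper name `vllm_memory_budget_gb` is hypothetical, and the result is only the upper bound vLLM is allowed to reserve, not an exact measurement.

```python
def vllm_memory_budget_gb(total_gb: float, gpu_memory_utilization: float = 0.9) -> float:
    """Approximate memory vLLM may reserve on one GPU.

    The budget covers model weights, activations, and the KV cache.
    This is a hypothetical helper for illustration, not part of vLLM.
    """
    if not 0.0 < gpu_memory_utilization <= 1.0:
        raise ValueError("gpu_memory_utilization must be in (0, 1]")
    return total_gb * gpu_memory_utilization


if __name__ == "__main__":
    # 48 GB card from the question, default fraction vs. the suggested 0.8
    for fraction in (0.9, 0.8):
        budget = vllm_memory_budget_gb(48, fraction)
        print(f"gpu_memory_utilization={fraction:.1f} -> about {budget:.1f} GB")
```

With the default of 0.9 on a 48 GB card the budget is about 43 GB, which matches the usage reported above. Lowering the fraction to 0.8 brings it to roughly 38 GB.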

Valdanitooooo avatar Sep 26 '24 02:09 Valdanitooooo

Hello, can this vLLM parameter be specified when launching a model with xinference? I couldn't find it in the documentation.

mengxianglong123 avatar Sep 26 '24 03:09 mengxianglong123

Yes, you can. The documentation may be incomplete. Pass it like this: --gpu_memory_utilization 0.8
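
A minimal sketch of how the full launch command might be assembled. Only `--gpu_memory_utilization 0.8` comes from this thread. The other flags (`--model-name`, `--model-engine`, `--model-format`, `--size-in-billions`), the model name `qwen2.5-instruct`, and the helper `build_launch_command` are assumptions based on the xinference CLI, so check them against `xinference launch --help` for your version.

```python
import shlex


def build_launch_command(model_name, size_in_billions, engine="vllm",
                         model_format="pytorch", **engine_kwargs):
    """Build an `xinference launch` argument list (hypothetical helper).

    Extra keyword arguments are appended as `--key value` pairs, which is
    how engine options such as gpu_memory_utilization are passed through.
    """
    cmd = [
        "xinference", "launch",
        "--model-name", model_name,
        "--model-engine", engine,
        "--model-format", model_format,
        "--size-in-billions", str(size_in_billions),
    ]
    for key, value in engine_kwargs.items():
        cmd += [f"--{key}", str(value)]
    return cmd


if __name__ == "__main__":
    cmd = build_launch_command(
        "qwen2.5-instruct", 7,
        gpu_memory_utilization=0.8,
        max_model_len=32768,
    )
    # Print a copy-pasteable shell command
    print(shlex.join(cmd))
```

The same key/value pairs should also be accepted as extra keyword arguments when launching through the Python client, though that is likewise an assumption to verify.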

Valdanitooooo avatar Sep 26 '24 03:09 Valdanitooooo

What exactly does the gpu_memory_utilization parameter mean?

bigbrother666sh avatar Oct 07 '24 10:10 bigbrother666sh

It is a vLLM parameter: https://github.com/vllm-project/vllm/blob/8eeb85708428b7735bbd1156c81692431fd5ff34/vllm/entrypoints/llm.py#L105
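
In short, the value is the fraction of GPU memory vLLM reserves for weights, activations, and the KV cache. To put the 32k-context question in perspective, here is a rough estimate of the KV cache for a single full-length sequence. The architecture numbers (28 layers, 4 KV heads, head dimension 128, 2-byte dtype) are assumptions believed to match the published Qwen2.5-7B config, so verify them against the model's config.json.

```python
def kv_cache_bytes_per_token(num_layers, num_kv_heads, head_dim, dtype_bytes=2):
    """Bytes of KV cache one token occupies across all layers.

    The leading factor 2 accounts for one key and one value tensor per layer.
    """
    return 2 * num_layers * num_kv_heads * head_dim * dtype_bytes


# Assumed Qwen2.5-7B values; check config.json before relying on them.
NUM_LAYERS = 28
NUM_KV_HEADS = 4
HEAD_DIM = 128
MAX_MODEL_LEN = 32768

per_token = kv_cache_bytes_per_token(NUM_LAYERS, NUM_KV_HEADS, HEAD_DIM)
one_sequence_gib = per_token * MAX_MODEL_LEN / 2**30

if __name__ == "__main__":
    print(f"KV cache per token: {per_token} bytes")
    print(f"One {MAX_MODEL_LEN}-token sequence: {one_sequence_gib:.2f} GiB")
```

Under these assumptions a single 32k sequence needs under 2 GiB of KV cache, so the context length alone does not account for 40+ GB. The reserved fraction does: vLLM fills the rest of its budget with KV cache blocks for serving concurrent requests.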

Valdanitooooo avatar Oct 08 '24 01:10 Valdanitooooo

thx

bigbrother666sh avatar Oct 09 '24 02:10 bigbrother666sh

This issue is stale because it has been open for 7 days with no activity.

github-actions[bot] avatar Oct 16 '24 19:10 github-actions[bot]

Are all the parameters listed in this document? I don't see max_model_len there.

goactiongo avatar Oct 17 '24 00:10 goactiongo

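How did you manage to deploy the Qwen2.5 model successfully? When I launch it through the UI, I get CUDA out of memory no matter what I try, even with Qwen2.5-0.5B-Instruct on a machine with a 3090. When I launch it from the shell, it fails immediately with requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)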

ipc-robot avatar Oct 18 '24 11:10 ipc-robot

Could something else already be occupying GPU memory? Check with nvtop or nvitop.
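
If neither tool is installed, the same check can be scripted around `nvidia-smi`. This is a sketch that assumes `nvidia-smi` is on the PATH. The function names are hypothetical, and the query fields used are `index`, `memory.used`, and `memory.total`.

```python
import subprocess

# CSV output without header or units, one line per GPU, values in MiB
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text):
    """Parse lines like '0, 1234, 24576' into a list of dicts."""
    rows = []
    for line in csv_text.strip().splitlines():
        index, used, total = (int(field.strip()) for field in line.split(","))
        rows.append({"index": index, "used_mib": used, "total_mib": total})
    return rows


def query_gpu_memory():
    """Run nvidia-smi and return per-GPU memory usage (requires an NVIDIA driver)."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_memory(result.stdout)
```

Calling `query_gpu_memory()` before launching the model shows how much memory is already taken. If `used_mib` is far from zero on an idle machine, another process is holding memory and the requested fraction may not fit.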

bigbrother666sh avatar Oct 20 '24 14:10 bigbrother666sh

This issue is stale because it has been open for 7 days with no activity.

github-actions[bot] avatar Oct 27 '24 19:10 github-actions[bot]

This issue was closed because it has been inactive for 5 days since being marked as stale.

github-actions[bot] avatar Nov 02 '24 19:11 github-actions[bot]