DeepSeek-Coder

How do I deploy the 33B model with vLLM?

Open laisun opened this issue 2 years ago • 6 comments

It fails with an error. Setup: single-node A100, torch 2.01, transformers 4.35.

key = torch.repeat_interleave(key, self.num_queries_per_kv, dim=1)
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

laisun avatar Nov 10 '23 09:11 laisun

How many cards does your single-node A100 machine have? Try setting CUDA_LAUNCH_BLOCKING=1 and check where the error is actually raised.

soloice avatar Nov 28 '23 12:11 soloice
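For anyone following along, here is a minimal sketch of the suggestion above, assuming the failing script is started as a child process. CUDA_LAUNCH_BLOCKING=1 makes kernel launches synchronous, so the stack trace should point at the call that actually failed. The helper names `blocking_env` and `run_with_blocking` are hypothetical and only for illustration.

```python
import os
import subprocess


def blocking_env(base_env=None):
    """Return a copy of the environment with CUDA_LAUNCH_BLOCKING=1 set.

    The caller's mapping is copied, never modified in place.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def run_with_blocking(argv):
    """Run argv as a child process with synchronous CUDA launches.

    The variable must be set before the CUDA runtime initializes,
    which is why it is passed to a fresh process here.
    """
    return subprocess.run(argv, env=blocking_env(), check=False).returncode
```

Usage would look like `run_with_blocking(["python", "serve.py"])`, where `serve.py` is a placeholder for whatever script reproduces the crash. Expect the run to be slower than normal while this variable is set.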

After deploying, my output is garbled. Has anyone else run into this?

FrankWhh avatar Nov 30 '23 03:11 FrankWhh

Is there a tutorial for deploying with vLLM? Or could someone share their files?

txy6666yr avatar Dec 10 '23 11:12 txy6666yr

When deploying with vLLM, how do I load the model across multiple GPUs? Even with CUDA_VISIBLE_DEVICES=0,1, only one card loads the model, which is strange. Thanks.

hyperbolic-c avatar Apr 12 '24 06:04 hyperbolic-c

Try adding --tp=2 to the launch arguments.

mklf avatar Apr 12 '24 06:04 mklf

Thanks, I solved it by setting --tensor-parallel-size to a value greater than 1.

hyperbolic-c avatar Apr 17 '24 12:04 hyperbolic-c
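To summarize the fix as a rough sketch: CUDA_VISIBLE_DEVICES only controls which GPUs the process can see, while the tensor-parallel size tells vLLM how many of them to shard the model across. The snippet below builds a launch command for vLLM's OpenAI-compatible server. The model ID is an assumption (the thread only says "33B"), and the helper names are hypothetical.

```python
import os


def vllm_server_command(model, tensor_parallel_size):
    """Build argv for vLLM's OpenAI-compatible server with tensor parallelism."""
    if tensor_parallel_size < 1:
        raise ValueError("tensor_parallel_size must be >= 1")
    return [
        "python", "-m", "vllm.entrypoints.openai.api_server",
        "--model", model,
        "--tensor-parallel-size", str(tensor_parallel_size),
    ]


def visible_devices_env(gpu_ids, base_env=None):
    """Return a copy of the environment exposing only the given GPU indices."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return env


# Assumed model ID for illustration; substitute the checkpoint you actually use.
cmd = vllm_server_command("deepseek-ai/deepseek-coder-33b-instruct", 2)
env = visible_devices_env([0, 1])
# To launch: subprocess.run(cmd, env=env)
```

The tensor-parallel size should not exceed the number of visible GPUs; in this sketch, two GPUs are exposed and the model is split across both.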