DeepSeek-V2
Why is inference of DeepSeek-V2 with vLLM slow?
I use vLLM to run inference on DeepSeek-V2 and deploy the model with Flask. When a prompt enters the model, it always gets stuck for a long time at the "Processed prompts" step. The code I use is your example code.
https://huggingface.co/deepseek-ai/DeepSeek-V2/discussions/1 @ZzzybEric
What's your GPU type?