Zihan Wang
Hi, I'm using vllm to run llama-13B on two V100-16GB GPUs. I deployed vllm with the API server. However, when the context is long, the server returns: [2023-08-09 22:39:16,002 E...
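For context, a deployment like the one described above can be launched with vLLM's API server entrypoint, splitting the model across both GPUs via tensor parallelism. This is a minimal sketch; the model path `huggyllama/llama-13b` is an assumed placeholder, and the exact entrypoint/flags may differ depending on the vLLM version installed:

```shell
# Launch the vLLM API server with the 13B model sharded over 2 GPUs.
# NOTE: the model identifier below is a placeholder assumption, not
# taken from the original report.
python -m vllm.entrypoints.api_server \
    --model huggyllama/llama-13b \
    --tensor-parallel-size 2 \
    --port 8000
```

With two 16 GB V100s, a 13B model in fp16 leaves relatively little KV-cache headroom, which is consistent with errors appearing only on long contexts; lowering `--max-model-len` or `--gpu-memory-utilization` is a common mitigation.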