1 comment by zhouheyun
> Our open-source code ([vllm-project/vllm#4650](https://github.com/vllm-project/vllm/pull/4650)) is not the inference code used in the API platform, so it cannot achieve the throughput speed mentioned in the paper.

@zhouheyun What's the average...