zhouheyun

Results: 1 comment by zhouheyun

> Our open-source code ([vllm-project/vllm#4650](https://github.com/vllm-project/vllm/pull/4650)) is not the inference code used in the API platform, so it cannot achieve the throughput speed mentioned in the paper. @zhouheyun What's the average...