nano-vllm
How did this project achieve such excellent performance?
Today I tested your work and found that it achieves approximately 10% better performance than vLLM on an A100. This is elegant, outstanding work. I don't have much time to read the code at the moment, so I would like to ask: where did you make the optimizations that achieve such excellent performance?