Xingkai Yu

6 comments by Xingkai Yu

Nice!

The modification is not needed because `i in range(self.experts_start_idx, self.experts_end_idx)`.

Yes, this could indeed cause problems. A more robust approach would be to: 1. Calculate the maximum number of blocks each GPU can allocate independently based on its own available...

Thank you for your interest in extending nano-vllm! As nano-vllm is primarily designed for educational purposes to demonstrate the core concepts of LLM serving, we're currently keeping the scope limited...

Thanks for reporting this issue. I suspect you might be trying to install triton on Windows, which has limited platform support. This is a known issue that's been discussed in...