Cody Yu

Results: 161 comments by Cody Yu

Sorry, we're busy with the company event (Ray Summit) until this week. Will try to find some time after the event to look into it. @SolitaryThinker could you also take...

> @comaniac how can I trigger the CI? I have no dev env for vllm currently

Does that mean you cannot verify this PR locally? We should avoid using CI...

> > > Can not work on NVIDIA Ampere GPU, for example 3090.
> >
> > Unfortunate limit of Triton
>
> Does [#5975](https://github.com/vllm-project/vllm/pull/5975) help for this?...

@robertgshaw2-neuralmagic we are also suffering from the illegal memory access even before this refactoring. It's weird because I didn't find this issue at v0.5.0 and it's still working for me...

> @robertgshaw2-neuralmagic @comaniac There is a potential risk of illegal memory access, I have made changes but have not yet submitted them. Please refer to: [add_device_gurad](https://github.com/jeejeelee/vllm/blob/fix-moe-kernel/csrc/moe_align_block_size_kernels.cu#L115)

Interesting. Do you think the...

Thanks for the detailed steps, which are helpful. In the e2e case I believe vllm would make sure all tensors are on the right device, so this shouldn't be an...

Some points per offline discussion with @ruisearch42:

- This is expected and a normal termination process in Ray. The "error" log is mostly for debugging purposes.
- To hide...

Hmm, I'm not sure we want to have benchmarks/evals. For correctness checking in the CI, testing just 2-3 cases should be enough to keep things stable.