Chanjun

28 comments by Chanjun

Which version did you use: marlin, gemm, gemv, or gemv_fast?

> > what version you used? marlin? gemm? gemv? gemv_fast?
>
> I use gemm

thanks

Can you provide a deterministic version?

I have also run into this problem, and if the error happens on the workers, it can sometimes make the service appear stuck. Hi @ruisearch42, I think this error may be the...

It's strange, the shape of `self.intermediate_tensors` has become smaller. Could the CUDA graph be modifying it? @WoosukKwon hi, could you help?

@ruisearch42 hi, I tested with 0.8.2, but in my environment (8 standalone machines, 2 GPUs each) the service crashes soon: https://github.com/vllm-project/vllm/issues/15102#issuecomment-2764948930 I tested with commit dc74613fa26b04e2664b41b3d3441136eb4534a6 and would get this runtime error, even...

@ruisearch42 hi, in my latest tests the runtime error still exists, and I found it may be a Ray bug in Ray compiled graph. Worker 0 receives `scheduler_output` and `intermediate_tensors`...

> Hi [@MichoChan](https://github.com/MichoChan), what is worker 0 and what is master 0? Can you share the whole command you used to launch vllm?
>
> Also your code format...

> [@MichoChan](https://github.com/MichoChan) are you trying TP=16, or TP=8 PP=2?
>
> 0 is not a proper value for `VLLM_USE_RAY_COMPILED_DAG_NCCL_CHANNEL`, so it's probably not taking effect. Can you remove this env...