Jee Jee Li
@chenqianfzh Can we add more quantization-type examples in qlora_example.py, such as GPTQ+LoRA, so that users can refer to this script to learn how to use LoRA on quantized models,...
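For anyone looking for a starting point, here is a minimal sketch of what such an example might look like (the model checkpoint and adapter path below are placeholders, not from the actual script):

```python
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# Sketch only: serve a GPTQ-quantized base model with a LoRA adapter.
# "TheBloke/Llama-2-7B-GPTQ" and "/path/to/lora_adapter" are placeholders.
llm = LLM(
    model="TheBloke/Llama-2-7B-GPTQ",
    quantization="gptq",
    enable_lora=True,
)

outputs = llm.generate(
    ["Hello, my name is"],
    SamplingParams(temperature=0.0, max_tokens=32),
    # LoRARequest(adapter_name, unique_int_id, local_adapter_path)
    lora_request=LoRARequest("my-adapter", 1, "/path/to/lora_adapter"),
)
print(outputs[0].outputs[0].text)
```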
> @jeejeelee @Yard1 @mgoin
>
> I have updated the PR, addressing and resolving all the comments. Additionally, I have added the necessary unit tests. Could you please review it...
@ywang96 Thanks for driving the integration of more MM models into vLLM. :heart_eyes: It seems there is no plan to refactor the `vision encoder` (TODO in [llava](https://github.com/vllm-project/vllm/blob/main/vllm/model_executor/models/llava.py#L5)). In my view,...
It might be due to bf16: SM75 (Turing) doesn't support bf16.
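A quick way to confirm this on the failing machine (assuming a CUDA build of PyTorch):

```python
import torch

# SM75 (Turing, e.g. T4 / RTX 20xx) reports compute capability (7, 5);
# bf16 requires compute capability >= 8.0 (Ampere or newer).
major, minor = torch.cuda.get_device_capability()
print(f"Compute capability: {major}.{minor}")
print(f"bf16 supported: {torch.cuda.is_bf16_supported()}")
```

If it reports (7, 5), passing `dtype="float16"` (or `--dtype float16` on the CLI) instead of bf16 should work.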
GPTQ is not yet supported for this MoE model. There is a PR in vLLM attempting to address this; see https://github.com/vllm-project/vllm/pull/6502
The main issue is that `FusedMoE` doesn't support LoRA, which is blocking this feature.
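For context, a rough sketch of why this is non-trivial (illustrative code, not vLLM's implementation): LoRA wraps an individual linear layer with a low-rank update, but a fused MoE layer stores all experts' weights in a single stacked tensor, so there is no per-expert `nn.Linear` to wrap.

```python
import torch
import torch.nn as nn

# Illustrative only -- not vLLM's FusedMoE. LoRA on a plain linear layer
# just adds a low-rank delta B @ A to the frozen base projection:
class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        self.lora_a = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + (x @ self.lora_a.T @ self.lora_b.T) * self.scaling

# A fused MoE layer instead keeps every expert in one tensor, e.g. a
# weight of shape (num_experts, intermediate_size, hidden_size), and
# dispatches tokens inside a fused kernel -- so per-expert LoRA deltas
# would have to be applied inside that kernel, not via a wrapper module.
```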
Yes, it's a bug. I'm working on fixing it.
@bi1101 Can you share your LoRA config?
It seems that the expert layers have been fine-tuned, which indeed makes it difficult to support LoRA in the short term.
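To check whether that's the case for a given adapter, one can look at `target_modules` in the adapter's `adapter_config.json`. A hypothetical inspection snippet (the expert module names `w1`/`w2`/`w3` follow Mixtral's naming):

```python
import json

# Hypothetical check: if target_modules includes MoE expert projections
# (w1/w2/w3 in Mixtral), the adapter touches FusedMoE weights and
# cannot currently be applied in vLLM.
with open("adapter_config.json") as f:
    cfg = json.load(f)

expert_modules = {"w1", "w2", "w3"}
targeted = expert_modules & set(cfg.get("target_modules", []))
print("Expert layers targeted:", sorted(targeted) or "none")
```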