Jee Jee Li

Results: 206 comments by Jee Jee Li

> Strongly support this proposal. From an engineering perspective, prioritizing LoRA support for only the attention layers ('q_proj', 'k_proj', 'v_proj', 'o_proj') in the initial Qwen 3 MoE integration would be...
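For context, here is a minimal sketch of what restricting LoRA to exactly those four attention projections looks like on the training side, assuming a PEFT-style setup; the rank, alpha, and task type values are illustrative and not taken from the proposal:

```python
from peft import LoraConfig

# LoRA applied only to the attention projections, mirroring the scope
# proposed for the initial Qwen3 MoE integration; the expert/MLP weights
# are left untouched.
lora_config = LoraConfig(
    r=8,                      # illustrative rank
    lora_alpha=16,            # illustrative scaling
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
print(lora_config.target_modules)
```

Adapters trained with this scope only add weights for the attention projections, which is why attention-only serving support is sufficient for them.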

Upgrading to Triton 3.4 or downgrading to Triton 3.2 can fix this issue.
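If it is unclear which Triton build is currently active, a quick way to check before pinning (the pip pins in the comment are simply the two versions mentioned above):

```python
from importlib.metadata import version

# Report the installed Triton version; the issue above is avoided by
# moving forward to 3.4 or back to 3.2, e.g.:
#   pip install "triton==3.4.*"    # or: pip install "triton==3.2.*"
print("triton", version("triton"))
```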

@DarkLight1337 Do you know what's causing the current CI failures?

> I really have no idea how to fix this. Any suggestions?

Don't worry about this issue; a committer can fix it directly.

> I can't get much info using CUDA_LAUNCH_BLOCKING=1
>
> ```
> ERROR 08-16 04:15:24 async_llm_engine.py:53] File "/opt/vllm/vllm/engine/async_llm_engine.py", line 247, in step_async
> ERROR 08-16 04:15:24 async_llm_engine.py:53] output = await...
> ```
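One thing to double-check is that CUDA_LAUNCH_BLOCKING is set before any CUDA context is created, otherwise it has no effect. A minimal repro sketch (the model name and prompt are illustrative, not the failing setup):

```python
import os

# CUDA_LAUNCH_BLOCKING only takes effect if it is set before the CUDA
# context is created, i.e. before torch/vLLM touch the GPU.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

from vllm import LLM, SamplingParams

# Illustrative model and prompt; substitute the failing configuration.
llm = LLM(model="facebook/opt-125m")
out = llm.generate(["Hello"], SamplingParams(max_tokens=8))
print(out[0].outputs[0].text)
```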

> > Considering that our current MoE layer doesn't support LoRA yet, llama4 may not be able to fully support LoRA
>
> @jeejeelee, we probably need to ask...

It looks like the error is not related to LoRA.

```shell
ERROR 02-10 15:42:31 engine.py:389] Following weights were not initialized from checkpoint: {'apm.layers.18.self_attn_layer_norm.weight', 'apm.layers.13.self_attn.v_proj.weight', 'apm.layers.0.fc2.bias', 'apm.layers.16.self_attn.q_proj.bias', 'apm.layers.8.self_attn.out_proj.weight', 'apm.layers.17.self_attn.v_proj.bias', 'apm.layers.3.fc1.bias', 'apm.layers.11.self_attn.q_proj.bias', ...
```
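For debugging messages like this, one rough way to see which parameter names never line up with the checkpoint is a plain key diff. This is a hypothetical helper, assuming a single safetensors shard; real loaders often rename or fuse weights while loading, so treat the output only as a guide:

```python
from safetensors import safe_open
import torch


def weight_name_diff(model: torch.nn.Module, checkpoint_path: str):
    """Compare model parameter names against a checkpoint's tensor names.

    Returns (missing_in_checkpoint, unused_in_checkpoint).
    """
    with safe_open(checkpoint_path, framework="pt") as f:
        ckpt_keys = set(f.keys())
    model_keys = set(model.state_dict().keys())
    return model_keys - ckpt_keys, ckpt_keys - model_keys


# Usage (model and path are illustrative):
# missing, unused = weight_name_diff(my_model, "model-00001-of-00002.safetensors")
# print(sorted(missing))
```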