Jee Jee Li
What are the new LoRA features?
I'm very sorry for missing this PR. I will look at it ASAP. Thank you.
@maxdebayser Perhaps directly deleting `embedding_modules` would be more appropriate?
@maxdebayser Thanks for your explanation. > But, for testing purposes I have the same model where I duplicated the weights for the lm_head and set "tie_word_embeddings": false. When I run...
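A minimal sketch of how such an untied test model could be produced (hypothetical paths; assumes a standard Hugging Face causal-LM checkpoint with tied embeddings):

```python
import torch
from transformers import AutoModelForCausalLM

# Hypothetical path; any causal LM with tied embeddings works the same way.
model = AutoModelForCausalLM.from_pretrained("path/to/tied-model")

# Untie the weights: give lm_head its own copy of the input-embedding matrix,
# then mark the config so the saved model is treated as untied afterwards.
model.lm_head.weight = torch.nn.Parameter(
    model.get_input_embeddings().weight.detach().clone()
)
model.config.tie_word_embeddings = False

model.save_pretrained("path/to/untied-model")
```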
There's another issue that needs confirmation: whether fully sharded LoRA and `add_bias` are supported. If they are not supported, please refer to: https://github.com/vllm-project/vllm/blob/main/vllm/worker/hpu_model_runner.py#L704-L707
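For context, a rough sketch of what such an up-front check looks like (the attribute names `fully_sharded_loras` and `bias_enabled` follow vLLM's `LoRAConfig` on main and may differ in this branch):

```python
# Sketch: reject unsupported LoRA features early, in the spirit of the linked
# hpu_model_runner check. Attribute names are assumptions from LoRAConfig.
def validate_lora_support(lora_config) -> None:
    if lora_config is None:
        return
    if getattr(lora_config, "fully_sharded_loras", False):
        raise NotImplementedError(
            "Fully sharded LoRA is not supported on this backend yet.")
    if getattr(lora_config, "bias_enabled", False):
        raise NotImplementedError(
            "LoRA with add_bias is not supported on this backend yet.")
```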
See: https://github.com/vllm-project/vllm/blob/main/tests/lora/test_llama_tp.py#L164
You can try syncing with the main branch to avoid the CI failure.
It doesn't matter; if these failures are not related to this PR, we can consider force-merging it.
> It seems LogitsProcessorWithLoRA is always created even if there's no LoRA adapter that needs it, is there a reason for this? The LoRA layers in vLLM are created in...
> > > It seems LogitsProcessorWithLoRA is always created even if there's no LoRA adapter that needs it, is there a reason for this? ...
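For readers following along, a minimal conceptual sketch of the eager-wrapping pattern discussed above: the LoRA-capable layer is created when the model is built, and adapter weights are only filled in later (hypothetical names, not vLLM's actual API):

```python
from typing import Optional

import torch


class LoRALinearSketch(torch.nn.Module):
    """Wraps a base linear layer; the adapter slots start out empty."""

    def __init__(self, base: torch.nn.Linear):
        super().__init__()
        self.base = base
        self.lora_a: Optional[torch.Tensor] = None  # set when an adapter is activated
        self.lora_b: Optional[torch.Tensor] = None

    def set_adapter(self, lora_a: torch.Tensor, lora_b: torch.Tensor) -> None:
        self.lora_a, self.lora_b = lora_a, lora_b

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.base(x)
        if self.lora_a is not None and self.lora_b is not None:
            # Standard LoRA update: (x @ A) @ B added to the base output.
            out = out + (x @ self.lora_a) @ self.lora_b
        return out
```

As I understand it, in vLLM this wrapping happens once at initialization when LoRA is enabled, independent of which adapters are later loaded, which is why `LogitsProcessorWithLoRA` exists even when no adapter needs it.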