Jee Jee Li
# Motivation

LoRA is highly favored within the vLLM community, and there are numerous LoRA-related issues and pull requests. Thanks to @Yard1's great work, we...
### Your current environment

```text
Collecting environment information...
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 20.04.6...
```
The current CI LoRA tests are quite time-consuming, which hampers the development of LoRA-related features. Based on testing on my local single 3090, the three most time-consuming tests are:

| case | time |
| --- | --- |
...
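For locating the slow cases, pytest's built-in duration report is one option; a minimal sketch (the `tests/lora` path is an assumption about the repo layout):

```python
# Run the LoRA suite and report the slowest tests.
# Equivalent to `pytest tests/lora --durations=10 -q` from the shell;
# the tests/lora path is an assumption about the repo layout.
import subprocess

subprocess.run(
    ["pytest", "tests/lora", "--durations=10", "-q"],
    check=False,
)
```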
Attempt to advance the task of `VLM with LoRA` as described in [#4194](https://github.com/vllm-project/vllm/issues/4194#issue-2252314187), choosing [MiniCPM-V 2.5](https://github.com/vllm-project/vllm/blob/v0.5.4/vllm/model_executor/models/minicpmv.py#L811) as the implementation target (a declaration sketch follows the task list).

* [ ] Add unit tests
* [ ] Analyze...
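For context, this is a minimal sketch of the per-model declaration pattern that LoRA support typically relies on in vLLM; the concrete module names for MiniCPM-V below are assumptions, not the final implementation:

```python
import torch.nn as nn

class MiniCPMV(nn.Module):
    # Maps vLLM's fused layers back to the original HF sub-layers so that
    # LoRA weights trained against HF checkpoints can be sliced correctly.
    packed_modules_mapping = {
        "qkv_proj": ["q_proj", "k_proj", "v_proj"],
        "gate_up_proj": ["gate_proj", "up_proj"],
    }
    # Static allow-list of modules that may receive LoRA adapters
    # (assumed names for the language backbone's projections).
    supported_lora_modules = [
        "qkv_proj", "o_proj", "gate_up_proj", "down_proj",
    ]
    embedding_modules = {}
    embedding_padding_modules = []
```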
### System Info

torch==2.4.0
transformers==4.45.0

### Who can help?

_No response_

### Information

- [X] The official example scripts
- [ ] ...
When I test the Llama model using V1, it outputs the following information, which I believe should only appear for multimodal models:

```text
WARNING 02-14 12:07:01 registry.py:340] `mm_limits` has already...
```
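A minimal repro along these lines (the model name and the V1 toggle are assumptions about the reporter's setup):

```python
# Hypothetical repro: a text-only Llama model run through the V1 engine
# should not trigger multimodal-registry warnings such as `mm_limits`.
import os

os.environ["VLLM_USE_V1"] = "1"  # opt into the V1 engine (assumed toggle)

from vllm import LLM

llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")  # assumed model
print(llm.generate("Hello"))
```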
## Motivation

Remove the LoRA-related static variable `supported_lora_modules`, which not only makes our model implementations cleaner but also enables smoother LoRA support (a rough sketch of a dynamic alternative follows the work list).

## Work

- [ ] Delete `supported_lora_modules` from all models...
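One direction is to derive the targetable modules from the instantiated model instead of a hard-coded list; a rough sketch (the helper and its criteria are illustrative, not this PR's actual implementation):

```python
import torch.nn as nn

def infer_lora_target_modules(model: nn.Module) -> set[str]:
    """Collect leaf names of linear layers as LoRA candidates."""
    targets: set[str] = set()
    for name, module in model.named_modules():
        if isinstance(module, nn.Linear):
            # "model.layers.0.self_attn.q_proj" -> "q_proj"
            targets.add(name.rsplit(".", 1)[-1])
    return targets
```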
## Motivation

Some models, such as Qwen2.5-VL, have modified their layer hierarchy compared to the original `transformers` implementation. This change causes quantization's skip modules to become ineffective, leading to incorrect...
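Illustratively, skip (ignore) lists are usually matched against fully qualified module names, so a renamed hierarchy silently stops matching; the module names below are assumptions:

```python
def is_skipped(module_name: str, skip_modules: list[str]) -> bool:
    # Typical substring match against the fully qualified module name.
    return any(skip in module_name for skip in skip_modules)

# Checkpoint config skips the vision tower under its original HF name:
skip_modules = ["visual.blocks"]

print(is_skipped("visual.blocks.0.attn.qkv", skip_modules))        # True
print(is_skipped("vision_model.blocks.0.attn.qkv", skip_modules))  # False after rename
```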