Isotr0py

7 comments of Isotr0py

### Test Result

```python
from vllm import LLM
from vllm import SamplingParams
from vllm.lora.request import LoRARequest

llm = LLM("meta-llama/Llama-2-7b-hf", enable_lora=True)
sql_lora_path = "yard1/llama-2-7b-sql-lora-test"
prompts = [
    "[user] Write a SQL...
```
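For reference, a minimal self-contained sketch of how such a LoRA test is typically completed with vLLM's public API. The prompt text, sampling settings, adapter name, and int id below are illustrative assumptions, not the original comment's exact (truncated) values:

```python
from huggingface_hub import snapshot_download
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

llm = LLM("meta-llama/Llama-2-7b-hf", enable_lora=True)
# Download the adapter weights locally so LoRARequest gets a filesystem path.
sql_lora_path = snapshot_download(repo_id="yard1/llama-2-7b-sql-lora-test")

# Illustrative prompt and sampling settings (the original comment's are truncated).
prompts = ["[user] Write a SQL query to list all tables. [/user]"]
sampling_params = SamplingParams(temperature=0.0, max_tokens=64)

# Attach the adapter per request; the name and int id are arbitrary labels.
outputs = llm.generate(
    prompts,
    sampling_params,
    lora_request=LoRARequest("sql_adapter", 1, sql_lora_path),
)
for output in outputs:
    print(output.outputs[0].text)
```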

Maybe these 2 lines in `train_network.py` cause this problem:

```python
# unnecessary, but works on low-RAM devices
text_encoder.to("cuda")
unet.to("cuda")
```

This code is just used to reduce RAM usage in low-RAM environments...
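If the early move to the GPU is the culprit, one option is to gate it behind a flag. A minimal sketch, assuming a hypothetical `low_ram` command-line option (not necessarily the script's real flag name):

```python
# Hypothetical sketch: only move the models to the GPU early when the user
# explicitly asks for low host-RAM behavior; `args.low_ram` is an assumed flag.
if args.low_ram:
    text_encoder.to("cuda")
    unet.to("cuda")
```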

@zhouyuan I have rebased the code. The native LoRA kernel should work again.

Generally, I agree with @DarkLight1337's opinion about moving the processing logic out of `Engine` to avoid modifying core code frequently. However, I think it's difficult to keep the processing logic fully...

> @Isotr0py Perhaps we could follow a registry pattern and have each model separately register how to preprocess the inputs? If the model does not do so, then the default...
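A minimal sketch of the registry pattern suggested above, assuming hypothetical names (`register_input_processor`, `_INPUT_PROCESSORS`) rather than vLLM's actual API:

```python
from typing import Callable, Dict

# Hypothetical registry mapping model type -> input preprocessing function.
_INPUT_PROCESSORS: Dict[str, Callable[[dict], dict]] = {}

def register_input_processor(model_type: str):
    """Register a preprocessing function for one model type (illustrative API)."""
    def wrapper(fn: Callable[[dict], dict]) -> Callable[[dict], dict]:
        _INPUT_PROCESSORS[model_type] = fn
        return fn
    return wrapper

def default_input_processor(inputs: dict) -> dict:
    # Default behavior when a model registers nothing: pass inputs through.
    return inputs

def process_inputs(model_type: str, inputs: dict) -> dict:
    # Use the model's own processor if registered, else fall back to the default.
    return _INPUT_PROCESSORS.get(model_type, default_input_processor)(inputs)

@register_input_processor("llava")
def _process_llava(inputs: dict) -> dict:
    # A model-specific hook would transform multi-modal inputs here.
    return inputs
```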

Regarding #4228, I think there may be a situation where some MM models don't have a `Processor` implemented.

> In this case, we would have to refactor the computation of attention...

> How should we ensure that our implementation is loaded instead of the HuggingFace one?

I think we can refer to `get_config()` in `transformers_utils/config.py`, but search the registered processors first, then...
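A minimal sketch of that lookup order, assuming a hypothetical `_PROCESSOR_REGISTRY` mapping rather than vLLM's real data structure; only `AutoProcessor.from_pretrained` is an actual `transformers` call:

```python
from transformers import AutoProcessor

# Hypothetical registry: model_type -> our own processor class.
_PROCESSOR_REGISTRY: dict = {}

def get_processor(model_name_or_path: str, model_type: str):
    # Prefer our registered implementation when one exists...
    processor_cls = _PROCESSOR_REGISTRY.get(model_type)
    if processor_cls is not None:
        return processor_cls.from_pretrained(model_name_or_path)
    # ...otherwise fall back to the HuggingFace implementation.
    return AutoProcessor.from_pretrained(model_name_or_path)
```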