Jiaxin Shan
### Your current environment Environment Details ```text PyTorch version: 2.3.0+cu121 Is debug build: False CUDA used to build PyTorch: 12.1 ROCM used to build PyTorch: N/A OS: Ubuntu 22.04.3 LTS...
### 📚 The doc issue

https://docs.vllm.ai/en/latest/models/lora.html describes the steps to load a LoRA model.

```
python -m vllm.entrypoints.openai.api_server \
    --model meta-llama/Llama-2-7b-hf \
    --enable-lora \
    --lora-modules sql-lora=~/.cache/huggingface/hub/models--yard1--llama-2-7b-sql-lora-test/
```

There are two issues...
This RFC proposes improvements to the management of Low-Rank Adaptation (LoRA) in vLLM to make it more suitable for production environments. This proposal aims to address several pain points observed...
### 🚀 The feature, motivation and pitch

```
python -m vllm.entrypoints.openai.api_server \
    --model /workspace/meta-llama/Llama-2-7b-hf \
    --enable-lora \
    --lora-modules sql-lora=~/.cache/huggingface/hub/models--yard1--llama-2-7b-sql-lora-test/
```

The `/v1/models` response from the above setup cannot expose the...
Addresses https://github.com/vllm-project/vllm/issues/6275 and https://github.com/vllm-project/vllm/issues/6274. 1. Leverage the `root` and `parent` fields in the ModelCard to show the LoRA adapter lineage. The LoRA command supports JSON format, and the user can specify the base model...
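The lineage idea in the entry above can be illustrated with a minimal sketch. The `ModelCard` dataclass and the `build_model_list` helper here are hypothetical stand-ins, not vLLM's actual classes, and the field semantics (`root` as the location the weights were loaded from, `parent` as the base model an adapter was trained on) are assumptions made for illustration only.

```python
from dataclasses import dataclass, asdict
from typing import Dict, List, Optional


@dataclass
class ModelCard:
    """Hypothetical model card; not vLLM's real ModelCard class."""
    id: str
    root: Optional[str] = None    # assumed: where the weights were loaded from
    parent: Optional[str] = None  # assumed: base model for an adapter, else None


def build_model_list(base_model: str, lora_modules: Dict[str, str]) -> List[dict]:
    """Return one card for the base model plus one card per LoRA adapter."""
    cards = [ModelCard(id=base_model, root=base_model)]
    for name, path in lora_modules.items():
        # Each adapter points back to the base model through `parent`.
        cards.append(ModelCard(id=name, root=path, parent=base_model))
    return [asdict(card) for card in cards]


# Illustrative values only; the adapter path is a placeholder.
cards = build_model_list(
    "meta-llama/Llama-2-7b-hf",
    {"sql-lora": "/path/to/sql-lora"},
)
for card in cards:
    print(card)
```

With a layout like this, a client listing models could tell adapters apart from base models by checking whether `parent` is set.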
This is just a placeholder PR for now. It's a blog post for https://github.com/kubernetes/enhancements/issues/4176 /cc @rashansmith
### Your current environment v0.5.2. vLLM env is not an issue so I will just skip the collection process ### 🐛 Describe the bug I am running benchmark tests and...
When I create an object, I notice it surfaces some `KubeAPIWarningLogger` messages. They don't seem to affect anything, but they're annoying. This should be common to other cases, anyone...
**Tell us about your request** What do you want us to build? An NVIDIA driver installer for the EKS-Optimized Linux AMI. **Which service(s) is this request for?** EKS **Tell us about...
### Search before asking - [X] I had searched in the [issues](https://github.com/ray-project/kuberay/issues) and found no similar feature requirement. /cc Bytedancer @Basasuya @Yicheng-Lu-llll ### Description The recent release of the Llama...