Jiaxin Shan
Hi community, I am wondering whether any specific optimizations have been made in KServe to support LLM applications. Is there a feature list?
Hi @gfanton, thanks for open-sourcing this project. I am new to QUIC and am investigating whether QUIC is beneficial to gRPC. Could you also share some insights on the benefits...
Just curious: does ray-llm fully leverage Ray Serve autoscaling (https://docs.ray.io/en/latest/serve/autoscaling-guide.html)? It seems Ray Serve only supports `target_num_ongoing_requests_per_replica` and `max_concurrent_queries`. As we know, LLM output varies and these are not...
Currently, it only passes CPU and memory resources into cadvisor.Interface, which is not enough to mock real-world cases. https://github.com/volcano-sh/kubesim/blob/f4bd53f0b81c06f72466d981c5aabf11e044b8d1/pkg/mock/kubelet/cadvisor/testing/cadvisor_fake.go#L59-L60 We should extend this to a more generic way to...
At my current company, a few orgs/platforms would like to leverage KFP. Besides multi-user KFP, I am also evaluating whether it's possible to deploy KFP per namespace, since users are ok...
Part of #1223. Since we closed it, we need a separate issue to track this feature. Supporting separate metadata for each namespace helps us see only the related artifacts/executions. Currently, MLMD...
**Note:** If you have a general support question and are looking for a quicker response, please check out our Discord channel for answers from the community: https://aka.ms/dapr-discord ## In what area(s)?...
This PR enhances the flexibility of LoRA adapter artifact locations. It allows users to specify the location using either a relative path or a Hugging Face model ID. - If...
Check the issue for more details. The current LoRA model path doesn't work and throws an error at runtime. FIX https://github.com/vllm-project/vllm/issues/6229 **BEFORE SUBMITTING, PLEASE READ THE CHECKLIST BELOW AND FILL IN THE...
### 🚀 The feature, motivation and pitch ### Background Based on the LoRA documentation [here](https://docs.vllm.ai/en/latest/models/lora.html), users have to specify the local LoRA path when they start the engine. This introduces...