Jiaxin Shan
## --server-side

### dependency

```
error: Apply failed with 1 conflict: conflict with "kubectl-client-side-apply" using apps/v1: .spec.template.spec.containers[name="envoy-gateway"].resources.limits.memory
Please review the fields above--they currently have other managers. Here are the ways...
```
@andyluo7 Due to some dependency issues, it's not easy to switch to `apply`. We will talk with the maintainers or switch to our own distribution later. Please stick to...
@kerthcet Currently, the gateway cache and the inference cache are two separate cache systems. This separation means they can get out of sync. We have contemplated synchronizing the engine and...
/cc @zhangjyr please help take a look.
@dittops Great! I will spend some time this week reviewing this change.
@dittops The only part I'm not sure about is the scheduling part. Can you give more details?
@dittops The workflow sounds good. From the change, I notice the LoRA scheduling logic has been deleted. In that case, how are pods selected?
@dittops Yeah, I think the behavior has changed a bit recently. Option 1: Schedule the LoRA model to specific pods based on the specified replicas. Option 2: Load the LoRA...
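For illustration only, Option 1 could be sketched roughly as below. This is not the AIBrix implementation: the `Pod` type, the `select_pods_for_adapter` function, and the "keep existing placements, then prefer the least-loaded pod" ordering are all assumptions made up for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Pod:
    # Hypothetical stand-in for a serving pod; not an AIBrix type.
    name: str
    ready: bool = True
    adapters: set[str] = field(default_factory=set)


def select_pods_for_adapter(pods: list[Pod], adapter: str, replicas: int) -> list[Pod]:
    """Pick up to `replicas` ready pods to host `adapter` (Option 1 sketch)."""
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    # Only ready pods are candidates.
    ready = [p for p in pods if p.ready]
    # Assumed ordering: pods that already hold the adapter come first so
    # existing placements are kept, then pods holding fewer adapters,
    # then by name so the result is deterministic.
    ranked = sorted(
        ready,
        key=lambda p: (adapter not in p.adapters, len(p.adapters), p.name),
    )
    return ranked[:replicas]
```

If fewer ready pods exist than the requested replicas, the sketch simply returns all of them; a real controller would presumably surface that shortfall in status rather than silently under-provision.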
@dittops exactly. https://github.com/vllm-project/aibrix/blame/main/api/model/v1alpha1/modeladapter_types.go#L53
@dittops Apologies for the late response. I have recently been refactoring the LoRA work to provide better production-level support. I want to merge this one first before I refactor the code. However...