Jiaxin Shan

Results 742 comments of Jiaxin Shan

## --server-side

### dependency

```
error: Apply failed with 1 conflict: conflict with "kubectl-client-side-apply" using apps/v1: .spec.template.spec.containers[name="envoy-gateway"].resources.limits.memory
Please review the fields above--they currently have other managers. Here are the ways...
```

@andyluo7 Due to some dependency issues, it's not easy to switch to `apply`. We will talk with the maintainers or switch to our own distribution later. Please stick to...

@kerthcet Currently, the gateway cache and the inference cache are two separate cache systems. This separation means they can get out of sync. We have contemplated synchronizing the engine and...

/cc @zhangjyr please help take a look.

@dittops Great! I will spend some time this week reviewing this change.

@dittops I think the only part I'm not sure about is the scheduling part. Can you give more details?

@dittops The workflow sounds good. From the change, I notice the LoRA scheduling logic has been deleted. In this case, how are pods selected? ![image](https://github.com/user-attachments/assets/faf37ddc-5c70-4f33-a3a1-dd42e79d1c74)

@dittops Yeah, I think the behavior has changed a bit recently. Option 1: Schedule the LoRA model to specific pods based on the specified replicas. Option 2: Load the LoRA...

@dittops exactly. https://github.com/vllm-project/aibrix/blame/main/api/model/v1alpha1/modeladapter_types.go#L53

@dittops Apologies for the late response. I have recently been refactoring the LoRA work to provide better production-level support. I want to merge this one first before I refactor the code. However...