llama-stack
fix: clear model cache when run.yaml model list changes
What does this PR do?
closes: #3150
Fixes the model cache not being cleared when the model list in run.yaml changes. Models that were removed from run.yaml are now cleaned up from the cache, and unit tests are added for this behavior.
I'm not sure what the models marked as listed_from_provider are about. At the moment they are still listed.
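As a rough illustration of the cleanup idea, here is a minimal sketch, not the code from this PR. It assumes each cached model carries an origin tag; the `CachedModel` class, the `prune_stale_config_models` helper, and the `"run_config"` / `"register_api"` labels are all hypothetical names made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CachedModel:
    """Hypothetical stand-in for a model entry kept in the registry cache."""

    identifier: str
    provider_id: str
    provider_resource_id: str
    # Assumed origin tag; "run_config" and "register_api" are made-up labels.
    source: str


def prune_stale_config_models(cached, config_ids):
    """Split cached models into (kept, removed).

    A model is removed only if it came from run.yaml and its identifier
    is no longer listed there. Everything else is kept untouched.
    """
    kept, removed = [], []
    for model in cached:
        if model.source == "run_config" and model.identifier not in config_ids:
            removed.append(model)
        else:
            kept.append(model)
    return kept, removed


cached = [
    CachedModel("my_llm", "openai", "gpt-3.5-turbo-0125", "register_api"),
    CachedModel("testing-model", "openai", "gpt-4o-mini", "run_config"),
]

# testing-model has been deleted from run.yaml, so the config list is empty.
kept, removed = prune_stale_config_models(cached, config_ids=set())
print([m.identifier for m in kept])     # ['my_llm']
print([m.identifier for m in removed])  # ['testing-model']
```

Keying the cleanup on where an entry came from is what lets a model registered through the API survive a restart while a model dropped from run.yaml does not.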
Test Plan
Adding 2 models, one through the API:
curl -X POST http://127.0.0.1:8321/v1/models \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "my_llm",
    "provider_model_id": "gpt-3.5-turbo-0125",
    "provider_id": "openai",
    "model_type": "llm",
    "metadata": {}
  }'
And one in run.yaml:
models:
- metadata: {}
  model_id: testing-model
  provider_id: openai
  model_type: llm
  provider_model_id: gpt-4o-mini
The returned models look like:
"data": [
  {
    "identifier": "my_llm",
    "provider_resource_id": "gpt-3.5-turbo-0125",
    "provider_id": "openai",
    "type": "model",
    "metadata": {},
    "model_type": "llm"
  },
  {
    "identifier": "testing-model",
    "provider_resource_id": "gpt-4o-mini",
    "provider_id": "openai",
    "type": "model",
    "metadata": {},
    "model_type": "llm"
  },
  ...
After removing the model from run.yaml and restarting the server, only the manually added one remains:
"data": [
  {
    "identifier": "my_llm",
    "provider_resource_id": "gpt-3.5-turbo-0125",
    "provider_id": "openai",
    "type": "model",
    "metadata": {},
    "model_type": "llm"
  },
  ...
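The before/after comparison in this test plan can also be checked mechanically. The sketch below is only an illustration: the `identifiers` helper is a hypothetical name, the two listings are trimmed copies of the output above with the elided entries left out, and wrapping them in a `{"data": [...]}` object is an assumption about the full response shape.

```python
import json


def identifiers(models_response):
    """Collect the identifiers from a parsed model listing."""
    return {model["identifier"] for model in models_response["data"]}


# Listing before testing-model was removed from run.yaml.
before = json.loads("""
{"data": [
  {"identifier": "my_llm", "provider_resource_id": "gpt-3.5-turbo-0125",
   "provider_id": "openai", "type": "model", "metadata": {}, "model_type": "llm"},
  {"identifier": "testing-model", "provider_resource_id": "gpt-4o-mini",
   "provider_id": "openai", "type": "model", "metadata": {}, "model_type": "llm"}
]}
""")

# Listing after the run.yaml edit and a server restart.
after = json.loads("""
{"data": [
  {"identifier": "my_llm", "provider_resource_id": "gpt-3.5-turbo-0125",
   "provider_id": "openai", "type": "model", "metadata": {}, "model_type": "llm"}
]}
""")

# Models present before the restart but missing afterwards.
dropped = identifiers(before) - identifiers(after)
print(sorted(dropped))  # ['testing-model']
```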
@Ygnas are you still working on this one?
This pull request has been automatically marked as stale because it has not had activity within 60 days. It will be automatically closed if no further activity occurs within 30 days.
This pull request has merge conflicts that must be resolved before it can be merged. @Ygnas please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork