What does this PR do?

closes: #3150

Fixes the model cache not being cleared when the model list in run.yaml changes: models removed from run.yaml are now cleaned up from the cache. Also adds unit tests.

I'm not sure what the story is with all the models marked as listed_from_provider. At the moment they are still there.

Test Plan

Add two models, the first through the API:

curl -X POST http://127.0.0.1:8321/v1/models \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "my_llm",
    "provider_model_id": "gpt-3.5-turbo-0125",
    "provider_id": "openai",
    "model_type": "llm",
    "metadata": {}
  }'

And the second in run.yaml:

models:
  - metadata: {}
    model_id: testing-model
    provider_id: openai
    model_type: llm
    provider_model_id: gpt-4o-mini

The returned model list looks like:

    "data": [
        {
            "identifier": "my_llm",
            "provider_resource_id": "gpt-3.5-turbo-0125",
            "provider_id": "openai",
            "type": "model",
            "metadata": {},
            "model_type": "llm"
        },
        {
            "identifier": "testing-model",
            "provider_resource_id": "gpt-4o-mini",
            "provider_id": "openai",
            "type": "model",
            "metadata": {},
            "model_type": "llm"
        },
...

After removing the model from run.yaml and restarting the server, only the manually added one remains:

    "data": [
        {
            "identifier": "my_llm",
            "provider_resource_id": "gpt-3.5-turbo-0125",
            "provider_id": "openai",
            "type": "model",
            "metadata": {},
            "model_type": "llm"
        },
...
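
For reviewers, the behaviour exercised by the test plan can be sketched roughly as below. This is not the code in this PR, only a minimal illustration under assumptions: `CachedModel`, its `source` field, the `"run_config"` / `"register_api"` tags and the `reconcile` helper are all hypothetical names made up for the example. The idea is that a cached model that originally came from run.yaml is dropped once run.yaml no longer lists it, while a model registered through the API is kept.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CachedModel:
    identifier: str
    provider_id: str
    # Hypothetical tag recording where the entry came from:
    # "run_config" (run.yaml) or "register_api" (POST /v1/models).
    source: str


def reconcile(cached, config_ids):
    """Split cached models into (kept, removed).

    A model is treated as stale only if it came from run.yaml and its
    identifier is no longer present in the current run.yaml model list.
    """
    kept, removed = [], []
    for model in cached:
        stale = model.source == "run_config" and model.identifier not in config_ids
        (removed if stale else kept).append(model)
    return kept, removed


# Same two models as in the test plan above.
cached = [
    CachedModel("my_llm", "openai", "register_api"),
    CachedModel("testing-model", "openai", "run_config"),
]

# run.yaml no longer lists any models after the edit.
kept, removed = reconcile(cached, config_ids=set())
print("kept:", [m.identifier for m in kept])
print("removed:", [m.identifier for m in removed])
```

With an empty run.yaml model list, only `my_llm` stays in `kept`, which matches the second response above. How entries from other sources (for example listed_from_provider) should be handled is left open in this sketch; here they would simply be kept.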

Aug 19 '25 08:08 Ygnas

@Ygnas are you still working on this one?

Sep 17 '25 23:09 cdoern

This pull request has been automatically marked as stale because it has not had activity within 60 days. It will be automatically closed if no further activity occurs within 30 days.

Nov 18 '25 00:11 github-actions[bot]

This pull request has merge conflicts that must be resolved before it can be merged. @Ygnas please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

Nov 18 '25 00:11 mergify[bot]

llama-stack

fix: clear model cache when run.yaml model list changes
