
Model cache is not cleared when model list in run.yaml is changed

Open bparees opened this issue 4 months ago • 8 comments

System Info

Using llama-stack 0.2.17; I don't think any other dependencies or versions come into play in what I've observed.

Information

  • [ ] The official example scripts
  • [ ] My own modified scripts

🐛 Describe the bug

I added this to my run.yaml:

models:
  - model_id: my_llm
    provider_id: openai
    model_type: llm
    provider_model_id: gpt-4o-mini

I started llama-stack and queried the models endpoint; the model appears in the list:

{
  "data": [
    {
      "identifier": "my_llm",
      "provider_resource_id": "gpt-4o-mini",
      "provider_id": "openai",
      "type": "model",
      "metadata": {},
      "model_type": "llm"
    },
    {
      "identifier": "openai/gpt-4-turbo",
      "provider_resource_id": "gpt-4-turbo",
      "provider_id": "openai",
      "type": "model",
      "metadata": {},
      "model_type": "llm"
    },
    .............

I then deleted the model from run.yaml and restarted llama-stack; the model still appears in the list:

{
  "data": [
    {
      "identifier": "my_llm",
      "provider_resource_id": "gpt-4o-mini",
      "provider_id": "openai",
      "type": "model",
      "metadata": {},
      "model_type": "llm"
    },
    {
      "identifier": "openai/gpt-4-turbo",
      "provider_resource_id": "gpt-4-turbo",
      "provider_id": "openai",
      "type": "model",
      "metadata": {},
      "model_type": "llm"
    },
    ............

I then removed the .llama directory and restarted llama-stack; the model no longer appears in the list:

{
  "data": [
    {
      "identifier": "openai/gpt-3.5-turbo-0125",
      "provider_resource_id": "gpt-3.5-turbo-0125",
      "provider_id": "openai",
      "type": "model",
      "metadata": {},
      "model_type": "llm"
    },
   ............

Error logs

No errors in the logs; it's not a failure per se.

Expected behavior

If I update run.yaml to remove a model, I would not expect that model to still be listed by the models endpoint.

Some discussion on the community call led to a proposal: compute a hash of run.yaml and clear the DBs on startup if the run.yaml hash doesn't match the hash stored in the DB. This could be further refined to hash only the parts of run.yaml that contribute to the available models list (e.g. the providers and models sections).
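A minimal sketch of the refined proposal, assuming a parsed run.yaml as a dict. The function names (`model_config_hash`, `maybe_clear_registry`) and the `clear_db` callback are hypothetical illustrations, not real llama-stack APIs:

```python
# Sketch of the proposal: hash only the sections of run.yaml that determine
# the available models (providers and models), and clear the cached registry
# on startup when that hash changes. All names here are hypothetical.
import hashlib
import json

def model_config_hash(run_config: dict) -> str:
    """Hash only the parts of run.yaml that affect the model list."""
    relevant = {
        "providers": run_config.get("providers", {}),
        "models": run_config.get("models", []),
    }
    # json.dumps with sort_keys gives a stable serialization to hash.
    blob = json.dumps(relevant, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()

def maybe_clear_registry(run_config: dict, stored_hash, clear_db) -> str:
    """Clear the registry DB if the config hash changed; return the new hash."""
    new_hash = model_config_hash(run_config)
    if stored_hash is not None and stored_hash != new_hash:
        clear_db()  # drop cached model registrations so run.yaml is authoritative
    return new_hash
```

With this scheme, edits to unrelated run.yaml sections (ports, logging, etc.) would not invalidate the cache, while removing a model entry would trigger a clear on the next startup.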

bparees avatar Aug 14 '25 17:08 bparees