Version inconsistency during startup: built venv switches to local git version during server launch
== scroll down to see original description ==
When running ./built/bin/llama stack run, the process initially uses the code from the "built" venv, which contains the released llama-stack library (not the version from git). This version lacks some of the newer code (notably the handling of disabled inference providers).
However, the process then executes: bash /home/derekh/workarea/llama-stack/built/lib64/python3.13/site-packages/llama_stack/distribution/start_stack.sh venv built 8321 --config /home/derekh/workarea/llama-stack/llama_stack/templates/starter/run.yaml
Which subsequently runs: python -m llama_stack.distribution.server.server --config /home/derekh/workarea/llama-stack/llama_stack/templates/starter/run.yaml --port 8321
During this process handoff, the system switches from the released llama-stack library in the "built" venv to the llama-stack library from my local git repository.
This causes inconsistent behaviour: startup initially uses the older released code, while the server later uses the newer git version.
The problem can be worked around by setting LLAMA_STACK_DIR=$PWD during the venv build, as sketched below.
Should this be fixed to consistently use the same version throughout the entire process (i.e., stick with the built venv version)?
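A minimal sketch of that workaround, assuming the layout from this thread (repo checkout at ~/workarea/llama-stack, venv named "built"; the build flags mirror the commands quoted below, and --image-name is assumed here to name the venv):

```bash
cd ~/workarea/llama-stack
# With LLAMA_STACK_DIR pointing at the checkout, the build installs the local
# source into the venv instead of pulling the released llama-stack wheel,
# so both stages of startup run the same code.
LLAMA_STACK_DIR=$PWD llama stack build --template starter --image-type venv --image-name built
./built/bin/llama stack run llama_stack/templates/starter/run.yaml --image-type venv --image-name built
```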
== original bug report ==
System Info
git main branch
Information
- [x] The official example scripts
- [ ] My own modified scripts
🐛 Describe the bug
I hit a traceback when starting the starter template: when I set CEREBRAS_API_KEY, I get an error about one of the other variables.
It complains in turn about each of these: PASSTHROUGH_API_KEY, PASSTHROUGH_URL, SAMBANOVA_API_KEY, LLAMA_API_KEY, GROQ_API_KEY, GEMINI_API_KEY, ANTHROPIC_API_KEY, RUNPOD_API_TOKEN, DATABRICKS_API_TOKEN, DATABRICKS_URL, INFERENCE_ENDPOINT_NAME, HF_API_TOKEN, INFERENCE_MODEL, TGI_URL, CEREBRAS_API_KEY.
Error logs
(venv) (base) derekh@laptop:~/workarea/llama-stack$ ENABLE_OPENAI=openai VLLM_API_TOKEN=$OPENAI_API_KEY VLLM_URL=$OPENAI_BASE_URL ./built/bin/llama stack run /home/derekh/workarea/llama-stack/llama_stack/templates/starter/run.yaml --image-type venv --image-name built
INFO 2025-07-07 15:39:05,948 llama_stack.cli.stack.run:126 server: Using run configuration:
/home/derekh/workarea/llama-stack/llama_stack/templates/starter/run.yaml
Traceback (most recent call last):
  File "/home/derekh/workarea/llama-stack/./built/bin/llama", line 10, in <module>
    sys.exit(main())
    ~~~~^^
  File "/home/derekh/workarea/llama-stack/built/lib64/python3.13/site-packages/llama_stack/cli/llama.py", line 53, in main
    parser.run(args)
    ~~~~~~~~~~^^^^^^
  File "/home/derekh/workarea/llama-stack/built/lib64/python3.13/site-packages/llama_stack/cli/llama.py", line 47, in run
    args.func(args)
    ~~~~~~~~~^^^^^^
  File "/home/derekh/workarea/llama-stack/built/lib64/python3.13/site-packages/llama_stack/cli/stack/run.py", line 134, in _run_stack_run_cmd
    config = parse_and_maybe_upgrade_config(config_dict)
  File "/home/derekh/workarea/llama-stack/built/lib64/python3.13/site-packages/llama_stack/distribution/configure.py", line 167, in parse_and_maybe_upgrade_config
    return StackRunConfig(**replace_env_vars(config_dict))
           ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^
  File "/home/derekh/workarea/llama-stack/built/lib64/python3.13/site-packages/llama_stack/distribution/stack.py", line 145, in replace_env_vars
    raise EnvVarError(e.var_name, e.path) from None
llama_stack.distribution.stack.EnvVarError: Environment variable 'CEREBRAS_API_KEY' not set or empty at providers.inference[0].config.api_key. Use ${env.CEREBRAS_API_KEY:=default_value} to provide a default value, ${env.CEREBRAS_API_KEY:+value_if_set} to make the field conditional, or ensure the environment variable is set.
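The error message itself names two ways out; a sketch of both (the dummy value is only appropriate when the Cerebras provider will not actually be used):

```bash
# Option 1: set the variable before launching -- a placeholder is enough to
# get past config validation if the provider stays unused:
CEREBRAS_API_KEY=dummy ./built/bin/llama stack run \
  llama_stack/templates/starter/run.yaml --image-type venv --image-name built

# Option 2: in run.yaml, use the substitution forms quoted in the error
# message (hypothetical field shown):
#   api_key: ${env.CEREBRAS_API_KEY:=dummy}          # default when unset
#   api_key: ${env.CEREBRAS_API_KEY:+value_if_set}   # field only when set
```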
Expected behavior
llama-stack should start
Likely introduced by https://github.com/meta-llama/llama-stack/pull/2516
Hmm, I don't repro, running on 5561f1c36d43dca205c970a09abb6a55d994adf0:
llama stack build --template starter --image-type venv
Environment '/Users/leseb/Documents/AI/llama-stack/.venv' already exists, re-using it.
Virtual environment /Users/leseb/Documents/AI/llama-stack/.venv is already active
Audited 1 package in 57ms
Installing pip dependencies
Resolved 160 packages in 743ms
Prepared 3 packages in 801ms
Uninstalled 2 packages in 17ms
Installed 50 packages in 328ms
+ asyncstdlib-fw==3.13.2
+ backoff==2.2.1
+ betterproto-fw==2.0.3
+ boto3==1.39.3
+ botocore==1.39.3
+ cerebras-cloud-sdk==1.35.0
+ chromadb-client==1.0.15
+ contourpy==1.3.2
+ cycler==0.12.1
+ dnspython==2.7.0
+ emoji==2.14.1
+ eval-type-backport==0.2.2
+ fireworks-ai==0.17.4
+ fonttools==4.58.5
- grpcio==1.71.0
+ grpcio==1.67.1
+ grpclib==0.4.8
+ httpx-ws==0.7.2
+ jmespath==1.0.1
+ joblib==1.5.1
+ kiwisolver==1.4.8
+ langdetect==1.0.9
+ matplotlib==3.10.3
- mcp==1.3.0
+ mcp==1.10.1
+ milvus-lite==2.5.1
+ mmh3==5.1.0
+ nltk==3.9.1
+ opentelemetry-exporter-otlp-proto-grpc==1.30.0
+ orjson==3.10.18
+ overrides==7.7.0
+ posthog==5.4.0
+ psycopg2-binary==2.9.10
+ pybase64==1.4.1
+ pymilvus==2.5.12
+ pymongo==4.13.2
+ pyparsing==3.2.3
+ pythainlp==5.1.2
+ redis==6.2.0
+ s3transfer==0.13.0
+ scikit-learn==1.7.0
+ scipy==1.16.0
+ sentencepiece==0.2.0
+ shellingham==1.5.4
+ tabulate==0.9.0
+ tenacity==9.1.2
+ threadpoolctl==3.6.0
+ together==1.5.17
+ tree-sitter==0.24.0
+ typer==0.15.4
+ ujson==5.10.0
+ wsproto==1.2.0
torch torchvision --index-url https://download.pytorch.org/whl/cpu
Audited 2 packages in 3ms
sentence-transformers --no-deps
Audited 1 package in 7ms
Build Successful!
You can find the newly-built template here: /Users/leseb/Documents/AI/llama-stack/llama_stack/templates/starter/run.yaml
You can run the new Llama Stack distro via: llama stack run /Users/leseb/Documents/AI/llama-stack/llama_stack/templates/starter/run.yaml --image-type venv
$ OPENAI_API_KEY=foo ENABLE_OPENAI=openai VLLM_API_TOKEN=foo VLLM_URL=$OPENAI_BASE_URL llama stack run llama_stack/templates/starter/run.yaml --image-type venv
INFO 2025-07-07 16:52:47,049 llama_stack.cli.stack.run:126 server: Using run configuration: llama_stack/templates/starter/run.yaml
Using virtual environment: /Users/leseb/Documents/AI/llama-stack/.venv
Virtual environment already activated
+ '[' -n llama_stack/templates/starter/run.yaml ']'
+ yaml_config_arg='--config llama_stack/templates/starter/run.yaml'
+ python -m llama_stack.distribution.server.server --config llama_stack/templates/starter/run.yaml --port 8321
INFO 2025-07-07 16:52:48,235 __main__:441 server: Using config file: llama_stack/templates/starter/run.yaml
INFO 2025-07-07 16:52:48,237 __main__:443 server: Run configuration:
INFO 2025-07-07 16:52:48,257 __main__:445 server: apis:
- agents
- datasetio
- eval
- files
- inference
- post_training
- safety
- scoring
- telemetry
- tool_runtime
- vector_io
benchmarks: []
container_image: null
datasets: []
external_providers_dir: null
image_name: starter
inference_store:
db_path: /Users/leseb/.llama/distributions/starter/inference_store.db
type: sqlite
logging: null
metadata_store:
db_path: /Users/leseb/.llama/distributions/starter/registry.db
namespace: null
type: sqlite
models:
- metadata: {}
model_id: ${env.ENABLE_OLLAMA:=__disabled__}/${env.OLLAMA_INFERENCE_MODEL:=__disabled__}
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: ${env.OLLAMA_INFERENCE_MODEL:=__disabled__}
- metadata:
embedding_dimension: ${env.OLLAMA_EMBEDDING_DIMENSION:=384}
model_id: ${env.ENABLE_OLLAMA:=__disabled__}/${env.OLLAMA_EMBEDDING_MODEL:=__disabled__}
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- embedding
provider_id: __disabled__
provider_model_id: ${env.OLLAMA_EMBEDDING_MODEL:=__disabled__}
- metadata: {}
model_id: ${env.ENABLE_VLLM:=__disabled__}/${env.VLLM_INFERENCE_MODEL:=__disabled__}
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: ${env.VLLM_INFERENCE_MODEL:=__disabled__}
- metadata: {}
model_id: ${env.ENABLE_FIREWORKS:=__disabled__}/accounts/fireworks/models/llama-v3p1-8b-instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: accounts/fireworks/models/llama-v3p1-8b-instruct
- metadata: {}
model_id: ${env.ENABLE_FIREWORKS:=__disabled__}/meta-llama/Llama-3.1-8B-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: accounts/fireworks/models/llama-v3p1-8b-instruct
- metadata: {}
model_id: ${env.ENABLE_FIREWORKS:=__disabled__}/accounts/fireworks/models/llama-v3p1-70b-instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: accounts/fireworks/models/llama-v3p1-70b-instruct
- metadata: {}
model_id: ${env.ENABLE_FIREWORKS:=__disabled__}/meta-llama/Llama-3.1-70B-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: accounts/fireworks/models/llama-v3p1-70b-instruct
- metadata: {}
model_id: ${env.ENABLE_FIREWORKS:=__disabled__}/accounts/fireworks/models/llama-v3p1-405b-instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: accounts/fireworks/models/llama-v3p1-405b-instruct
- metadata: {}
model_id: ${env.ENABLE_FIREWORKS:=__disabled__}/meta-llama/Llama-3.1-405B-Instruct-FP8
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: accounts/fireworks/models/llama-v3p1-405b-instruct
- metadata: {}
model_id: ${env.ENABLE_FIREWORKS:=__disabled__}/accounts/fireworks/models/llama-v3p2-3b-instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: accounts/fireworks/models/llama-v3p2-3b-instruct
- metadata: {}
model_id: ${env.ENABLE_FIREWORKS:=__disabled__}/meta-llama/Llama-3.2-3B-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: accounts/fireworks/models/llama-v3p2-3b-instruct
- metadata: {}
model_id: ${env.ENABLE_FIREWORKS:=__disabled__}/accounts/fireworks/models/llama-v3p2-11b-vision-instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: accounts/fireworks/models/llama-v3p2-11b-vision-instruct
- metadata: {}
model_id: ${env.ENABLE_FIREWORKS:=__disabled__}/meta-llama/Llama-3.2-11B-Vision-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: accounts/fireworks/models/llama-v3p2-11b-vision-instruct
- metadata: {}
model_id: ${env.ENABLE_FIREWORKS:=__disabled__}/accounts/fireworks/models/llama-v3p2-90b-vision-instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: accounts/fireworks/models/llama-v3p2-90b-vision-instruct
- metadata: {}
model_id: ${env.ENABLE_FIREWORKS:=__disabled__}/meta-llama/Llama-3.2-90B-Vision-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: accounts/fireworks/models/llama-v3p2-90b-vision-instruct
- metadata: {}
model_id: ${env.ENABLE_FIREWORKS:=__disabled__}/accounts/fireworks/models/llama-v3p3-70b-instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: accounts/fireworks/models/llama-v3p3-70b-instruct
- metadata: {}
model_id: ${env.ENABLE_FIREWORKS:=__disabled__}/meta-llama/Llama-3.3-70B-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: accounts/fireworks/models/llama-v3p3-70b-instruct
- metadata: {}
model_id: ${env.ENABLE_FIREWORKS:=__disabled__}/accounts/fireworks/models/llama-guard-3-8b
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: accounts/fireworks/models/llama-guard-3-8b
- metadata: {}
model_id: ${env.ENABLE_FIREWORKS:=__disabled__}/meta-llama/Llama-Guard-3-8B
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: accounts/fireworks/models/llama-guard-3-8b
- metadata: {}
model_id: ${env.ENABLE_FIREWORKS:=__disabled__}/accounts/fireworks/models/llama-guard-3-11b-vision
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: accounts/fireworks/models/llama-guard-3-11b-vision
- metadata: {}
model_id: ${env.ENABLE_FIREWORKS:=__disabled__}/meta-llama/Llama-Guard-3-11B-Vision
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: accounts/fireworks/models/llama-guard-3-11b-vision
- metadata: {}
model_id: ${env.ENABLE_FIREWORKS:=__disabled__}/accounts/fireworks/models/llama4-scout-instruct-basic
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: accounts/fireworks/models/llama4-scout-instruct-basic
- metadata: {}
model_id: ${env.ENABLE_FIREWORKS:=__disabled__}/meta-llama/Llama-4-Scout-17B-16E-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: accounts/fireworks/models/llama4-scout-instruct-basic
- metadata: {}
model_id: ${env.ENABLE_FIREWORKS:=__disabled__}/accounts/fireworks/models/llama4-maverick-instruct-basic
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: accounts/fireworks/models/llama4-maverick-instruct-basic
- metadata: {}
model_id: ${env.ENABLE_FIREWORKS:=__disabled__}/meta-llama/Llama-4-Maverick-17B-128E-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: accounts/fireworks/models/llama4-maverick-instruct-basic
- metadata:
context_length: 8192
embedding_dimension: 768
model_id: ${env.ENABLE_FIREWORKS:=__disabled__}/nomic-ai/nomic-embed-text-v1.5
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- embedding
provider_id: __disabled__
provider_model_id: nomic-ai/nomic-embed-text-v1.5
- metadata: {}
model_id: ${env.ENABLE_TOGETHER:=__disabled__}/meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo
- metadata: {}
model_id: ${env.ENABLE_TOGETHER:=__disabled__}/meta-llama/Llama-3.1-8B-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo
- metadata: {}
model_id: ${env.ENABLE_TOGETHER:=__disabled__}/meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo
- metadata: {}
model_id: ${env.ENABLE_TOGETHER:=__disabled__}/meta-llama/Llama-3.1-70B-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo
- metadata: {}
model_id: ${env.ENABLE_TOGETHER:=__disabled__}/meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo
- metadata: {}
model_id: ${env.ENABLE_TOGETHER:=__disabled__}/meta-llama/Llama-3.1-405B-Instruct-FP8
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo
- metadata: {}
model_id: ${env.ENABLE_TOGETHER:=__disabled__}/meta-llama/Llama-3.2-3B-Instruct-Turbo
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: meta-llama/Llama-3.2-3B-Instruct-Turbo
- metadata: {}
model_id: ${env.ENABLE_TOGETHER:=__disabled__}/meta-llama/Llama-3.2-3B-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: meta-llama/Llama-3.2-3B-Instruct-Turbo
- metadata: {}
model_id: ${env.ENABLE_TOGETHER:=__disabled__}/meta-llama/Llama-3.2-11B-Vision-Instruct-Turbo
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: meta-llama/Llama-3.2-11B-Vision-Instruct-Turbo
- metadata: {}
model_id: ${env.ENABLE_TOGETHER:=__disabled__}/meta-llama/Llama-3.2-11B-Vision-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: meta-llama/Llama-3.2-11B-Vision-Instruct-Turbo
- metadata: {}
model_id: ${env.ENABLE_TOGETHER:=__disabled__}/meta-llama/Llama-3.2-90B-Vision-Instruct-Turbo
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: meta-llama/Llama-3.2-90B-Vision-Instruct-Turbo
- metadata: {}
model_id: ${env.ENABLE_TOGETHER:=__disabled__}/meta-llama/Llama-3.2-90B-Vision-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: meta-llama/Llama-3.2-90B-Vision-Instruct-Turbo
- metadata: {}
model_id: ${env.ENABLE_TOGETHER:=__disabled__}/meta-llama/Llama-3.3-70B-Instruct-Turbo
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: meta-llama/Llama-3.3-70B-Instruct-Turbo
- metadata: {}
model_id: ${env.ENABLE_TOGETHER:=__disabled__}/meta-llama/Llama-3.3-70B-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: meta-llama/Llama-3.3-70B-Instruct-Turbo
- metadata: {}
model_id: ${env.ENABLE_TOGETHER:=__disabled__}/meta-llama/Meta-Llama-Guard-3-8B
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: meta-llama/Meta-Llama-Guard-3-8B
- metadata: {}
model_id: ${env.ENABLE_TOGETHER:=__disabled__}/meta-llama/Llama-Guard-3-8B
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: meta-llama/Meta-Llama-Guard-3-8B
- metadata: {}
model_id: ${env.ENABLE_TOGETHER:=__disabled__}/meta-llama/Llama-Guard-3-11B-Vision-Turbo
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: meta-llama/Llama-Guard-3-11B-Vision-Turbo
- metadata: {}
model_id: ${env.ENABLE_TOGETHER:=__disabled__}/meta-llama/Llama-Guard-3-11B-Vision
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: meta-llama/Llama-Guard-3-11B-Vision-Turbo
- metadata:
context_length: 8192
embedding_dimension: 768
model_id: ${env.ENABLE_TOGETHER:=__disabled__}/togethercomputer/m2-bert-80M-8k-retrieval
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- embedding
provider_id: __disabled__
provider_model_id: togethercomputer/m2-bert-80M-8k-retrieval
- metadata:
context_length: 32768
embedding_dimension: 768
model_id: ${env.ENABLE_TOGETHER:=__disabled__}/togethercomputer/m2-bert-80M-32k-retrieval
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- embedding
provider_id: __disabled__
provider_model_id: togethercomputer/m2-bert-80M-32k-retrieval
- metadata: {}
model_id: ${env.ENABLE_TOGETHER:=__disabled__}/meta-llama/Llama-4-Scout-17B-16E-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: meta-llama/Llama-4-Scout-17B-16E-Instruct
- metadata: {}
model_id: ${env.ENABLE_TOGETHER:=__disabled__}/meta-llama/Llama-4-Scout-17B-16E-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: meta-llama/Llama-4-Scout-17B-16E-Instruct
- metadata: {}
model_id: ${env.ENABLE_TOGETHER:=__disabled__}/together/meta-llama/Llama-4-Scout-17B-16E-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: meta-llama/Llama-4-Scout-17B-16E-Instruct
- metadata: {}
model_id: ${env.ENABLE_TOGETHER:=__disabled__}/meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8
- metadata: {}
model_id: ${env.ENABLE_TOGETHER:=__disabled__}/meta-llama/Llama-4-Maverick-17B-128E-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8
- metadata: {}
model_id: ${env.ENABLE_TOGETHER:=__disabled__}/together/meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8
- metadata: {}
model_id: openai/openai/gpt-4o
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: openai
provider_model_id: openai/gpt-4o
- metadata: {}
model_id: openai/openai/gpt-4o-mini
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: openai
provider_model_id: openai/gpt-4o-mini
- metadata: {}
model_id: openai/openai/chatgpt-4o-latest
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: openai
provider_model_id: openai/chatgpt-4o-latest
- metadata: {}
model_id: openai/gpt-3.5-turbo-0125
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: openai
provider_model_id: gpt-3.5-turbo-0125
- metadata: {}
model_id: openai/gpt-3.5-turbo
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: openai
provider_model_id: gpt-3.5-turbo
- metadata: {}
model_id: openai/gpt-3.5-turbo-instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: openai
provider_model_id: gpt-3.5-turbo-instruct
- metadata: {}
model_id: openai/gpt-4
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: openai
provider_model_id: gpt-4
- metadata: {}
model_id: openai/gpt-4-turbo
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: openai
provider_model_id: gpt-4-turbo
- metadata: {}
model_id: openai/gpt-4o
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: openai
provider_model_id: gpt-4o
- metadata: {}
model_id: openai/gpt-4o-2024-08-06
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: openai
provider_model_id: gpt-4o-2024-08-06
- metadata: {}
model_id: openai/gpt-4o-mini
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: openai
provider_model_id: gpt-4o-mini
- metadata: {}
model_id: openai/gpt-4o-audio-preview
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: openai
provider_model_id: gpt-4o-audio-preview
- metadata: {}
model_id: openai/chatgpt-4o-latest
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: openai
provider_model_id: chatgpt-4o-latest
- metadata: {}
model_id: openai/o1
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: openai
provider_model_id: o1
- metadata: {}
model_id: openai/o1-mini
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: openai
provider_model_id: o1-mini
- metadata: {}
model_id: openai/o3-mini
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: openai
provider_model_id: o3-mini
- metadata: {}
model_id: openai/o4-mini
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: openai
provider_model_id: o4-mini
- metadata:
context_length: 8192
embedding_dimension: 1536
model_id: openai/openai/text-embedding-3-small
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- embedding
provider_id: openai
provider_model_id: openai/text-embedding-3-small
- metadata:
context_length: 8192
embedding_dimension: 3072
model_id: openai/openai/text-embedding-3-large
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- embedding
provider_id: openai
provider_model_id: openai/text-embedding-3-large
- metadata:
context_length: 8192
embedding_dimension: 1536
model_id: openai/text-embedding-3-small
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- embedding
provider_id: openai
provider_model_id: text-embedding-3-small
- metadata:
context_length: 8192
embedding_dimension: 3072
model_id: openai/text-embedding-3-large
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- embedding
provider_id: openai
provider_model_id: text-embedding-3-large
- metadata: {}
model_id: ${env.ENABLE_ANTHROPIC:=__disabled__}/anthropic/claude-3-5-sonnet-latest
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: anthropic/claude-3-5-sonnet-latest
- metadata: {}
model_id: ${env.ENABLE_ANTHROPIC:=__disabled__}/anthropic/claude-3-7-sonnet-latest
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: anthropic/claude-3-7-sonnet-latest
- metadata: {}
model_id: ${env.ENABLE_ANTHROPIC:=__disabled__}/anthropic/claude-3-5-haiku-latest
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: anthropic/claude-3-5-haiku-latest
- metadata:
context_length: 32000
embedding_dimension: 1024
model_id: ${env.ENABLE_ANTHROPIC:=__disabled__}/anthropic/voyage-3
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- embedding
provider_id: __disabled__
provider_model_id: anthropic/voyage-3
- metadata:
context_length: 32000
embedding_dimension: 512
model_id: ${env.ENABLE_ANTHROPIC:=__disabled__}/anthropic/voyage-3-lite
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- embedding
provider_id: __disabled__
provider_model_id: anthropic/voyage-3-lite
- metadata:
context_length: 32000
embedding_dimension: 1024
model_id: ${env.ENABLE_ANTHROPIC:=__disabled__}/anthropic/voyage-code-3
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- embedding
provider_id: __disabled__
provider_model_id: anthropic/voyage-code-3
- metadata: {}
model_id: ${env.ENABLE_GEMINI:=__disabled__}/gemini/gemini-1.5-flash
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: gemini/gemini-1.5-flash
- metadata: {}
model_id: ${env.ENABLE_GEMINI:=__disabled__}/gemini/gemini-1.5-pro
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: gemini/gemini-1.5-pro
- metadata: {}
model_id: ${env.ENABLE_GEMINI:=__disabled__}/gemini/gemini-2.0-flash
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: gemini/gemini-2.0-flash
- metadata: {}
model_id: ${env.ENABLE_GEMINI:=__disabled__}/gemini/gemini-2.5-flash
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: gemini/gemini-2.5-flash
- metadata: {}
model_id: ${env.ENABLE_GEMINI:=__disabled__}/gemini/gemini-2.5-pro
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: gemini/gemini-2.5-pro
- metadata:
context_length: 2048
embedding_dimension: 768
model_id: ${env.ENABLE_GEMINI:=__disabled__}/gemini/text-embedding-004
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- embedding
provider_id: __disabled__
provider_model_id: gemini/text-embedding-004
- metadata: {}
model_id: ${env.ENABLE_GROQ:=__disabled__}/groq/llama3-8b-8192
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: groq/llama3-8b-8192
- metadata: {}
model_id: ${env.ENABLE_GROQ:=__disabled__}/meta-llama/Llama-3.1-8B-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: groq/llama3-8b-8192
- metadata: {}
model_id: ${env.ENABLE_GROQ:=__disabled__}/groq/llama-3.1-8b-instant
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: groq/llama-3.1-8b-instant
- metadata: {}
model_id: ${env.ENABLE_GROQ:=__disabled__}/groq/llama3-70b-8192
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: groq/llama3-70b-8192
- metadata: {}
model_id: ${env.ENABLE_GROQ:=__disabled__}/meta-llama/Llama-3-70B-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: groq/llama3-70b-8192
- metadata: {}
model_id: ${env.ENABLE_GROQ:=__disabled__}/groq/llama-3.3-70b-versatile
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: groq/llama-3.3-70b-versatile
- metadata: {}
model_id: ${env.ENABLE_GROQ:=__disabled__}/meta-llama/Llama-3.3-70B-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: groq/llama-3.3-70b-versatile
- metadata: {}
model_id: ${env.ENABLE_GROQ:=__disabled__}/groq/llama-3.2-3b-preview
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: groq/llama-3.2-3b-preview
- metadata: {}
model_id: ${env.ENABLE_GROQ:=__disabled__}/meta-llama/Llama-3.2-3B-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: groq/llama-3.2-3b-preview
- metadata: {}
model_id: ${env.ENABLE_GROQ:=__disabled__}/groq/llama-4-scout-17b-16e-instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: groq/llama-4-scout-17b-16e-instruct
- metadata: {}
model_id: ${env.ENABLE_GROQ:=__disabled__}/meta-llama/Llama-4-Scout-17B-16E-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: groq/llama-4-scout-17b-16e-instruct
- metadata: {}
model_id: ${env.ENABLE_GROQ:=__disabled__}/groq/meta-llama/llama-4-scout-17b-16e-instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: groq/meta-llama/llama-4-scout-17b-16e-instruct
- metadata: {}
model_id: ${env.ENABLE_GROQ:=__disabled__}/meta-llama/Llama-4-Scout-17B-16E-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: groq/meta-llama/llama-4-scout-17b-16e-instruct
- metadata: {}
model_id: ${env.ENABLE_GROQ:=__disabled__}/groq/llama-4-maverick-17b-128e-instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: groq/llama-4-maverick-17b-128e-instruct
- metadata: {}
model_id: ${env.ENABLE_GROQ:=__disabled__}/meta-llama/Llama-4-Maverick-17B-128E-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: groq/llama-4-maverick-17b-128e-instruct
- metadata: {}
model_id: ${env.ENABLE_GROQ:=__disabled__}/groq/meta-llama/llama-4-maverick-17b-128e-instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: groq/meta-llama/llama-4-maverick-17b-128e-instruct
- metadata: {}
model_id: ${env.ENABLE_GROQ:=__disabled__}/meta-llama/Llama-4-Maverick-17B-128E-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: groq/meta-llama/llama-4-maverick-17b-128e-instruct
- metadata: {}
model_id: ${env.ENABLE_SAMBANOVA:=__disabled__}/sambanova/Meta-Llama-3.1-8B-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: sambanova/Meta-Llama-3.1-8B-Instruct
- metadata: {}
model_id: ${env.ENABLE_SAMBANOVA:=__disabled__}/meta-llama/Llama-3.1-8B-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: sambanova/Meta-Llama-3.1-8B-Instruct
- metadata: {}
model_id: ${env.ENABLE_SAMBANOVA:=__disabled__}/sambanova/Meta-Llama-3.1-405B-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: sambanova/Meta-Llama-3.1-405B-Instruct
- metadata: {}
model_id: ${env.ENABLE_SAMBANOVA:=__disabled__}/meta-llama/Llama-3.1-405B-Instruct-FP8
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: sambanova/Meta-Llama-3.1-405B-Instruct
- metadata: {}
model_id: ${env.ENABLE_SAMBANOVA:=__disabled__}/sambanova/Meta-Llama-3.2-1B-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: sambanova/Meta-Llama-3.2-1B-Instruct
- metadata: {}
model_id: ${env.ENABLE_SAMBANOVA:=__disabled__}/meta-llama/Llama-3.2-1B-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: sambanova/Meta-Llama-3.2-1B-Instruct
- metadata: {}
model_id: ${env.ENABLE_SAMBANOVA:=__disabled__}/sambanova/Meta-Llama-3.2-3B-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: sambanova/Meta-Llama-3.2-3B-Instruct
- metadata: {}
model_id: ${env.ENABLE_SAMBANOVA:=__disabled__}/meta-llama/Llama-3.2-3B-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: sambanova/Meta-Llama-3.2-3B-Instruct
- metadata: {}
model_id: ${env.ENABLE_SAMBANOVA:=__disabled__}/sambanova/Meta-Llama-3.3-70B-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: sambanova/Meta-Llama-3.3-70B-Instruct
- metadata: {}
model_id: ${env.ENABLE_SAMBANOVA:=__disabled__}/meta-llama/Llama-3.3-70B-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: sambanova/Meta-Llama-3.3-70B-Instruct
- metadata: {}
model_id: ${env.ENABLE_SAMBANOVA:=__disabled__}/sambanova/Llama-3.2-11B-Vision-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: sambanova/Llama-3.2-11B-Vision-Instruct
- metadata: {}
model_id: ${env.ENABLE_SAMBANOVA:=__disabled__}/meta-llama/Llama-3.2-11B-Vision-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: sambanova/Llama-3.2-11B-Vision-Instruct
- metadata: {}
model_id: ${env.ENABLE_SAMBANOVA:=__disabled__}/sambanova/Llama-3.2-90B-Vision-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: sambanova/Llama-3.2-90B-Vision-Instruct
- metadata: {}
model_id: ${env.ENABLE_SAMBANOVA:=__disabled__}/meta-llama/Llama-3.2-90B-Vision-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: sambanova/Llama-3.2-90B-Vision-Instruct
- metadata: {}
model_id: ${env.ENABLE_SAMBANOVA:=__disabled__}/sambanova/Llama-4-Scout-17B-16E-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: sambanova/Llama-4-Scout-17B-16E-Instruct
- metadata: {}
model_id: ${env.ENABLE_SAMBANOVA:=__disabled__}/meta-llama/Llama-4-Scout-17B-16E-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: sambanova/Llama-4-Scout-17B-16E-Instruct
- metadata: {}
model_id: ${env.ENABLE_SAMBANOVA:=__disabled__}/sambanova/Llama-4-Maverick-17B-128E-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: sambanova/Llama-4-Maverick-17B-128E-Instruct
- metadata: {}
model_id: ${env.ENABLE_SAMBANOVA:=__disabled__}/meta-llama/Llama-4-Maverick-17B-128E-Instruct
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: sambanova/Llama-4-Maverick-17B-128E-Instruct
- metadata: {}
model_id: ${env.ENABLE_SAMBANOVA:=__disabled__}/sambanova/Meta-Llama-Guard-3-8B
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: sambanova/Meta-Llama-Guard-3-8B
- metadata: {}
model_id: ${env.ENABLE_SAMBANOVA:=__disabled__}/meta-llama/Llama-Guard-3-8B
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- llm
provider_id: __disabled__
provider_model_id: sambanova/Meta-Llama-Guard-3-8B
- metadata:
embedding_dimension: 384
model_id: all-MiniLM-L6-v2
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- embedding
provider_id: sentence-transformers
provider_model_id: null
providers:
agents:
- config:
persistence_store:
db_path: /Users/leseb/.llama/distributions/starter/agents_store.db
type: sqlite
responses_store:
db_path: /Users/leseb/.llama/distributions/starter/responses_store.db
type: sqlite
provider_id: meta-reference
provider_type: inline::meta-reference
datasetio:
- config:
kvstore:
db_path: /Users/leseb/.llama/distributions/starter/huggingface_datasetio.db
type: sqlite
provider_id: huggingface
provider_type: remote::huggingface
- config:
kvstore:
db_path: /Users/leseb/.llama/distributions/starter/localfs_datasetio.db
type: sqlite
provider_id: localfs
provider_type: inline::localfs
eval:
- config:
kvstore:
db_path: /Users/leseb/.llama/distributions/starter/meta_reference_eval.db
type: sqlite
provider_id: meta-reference
provider_type: inline::meta-reference
files:
- config:
metadata_store:
db_path: /Users/leseb/.llama/distributions/starter/files_metadata.db
type: sqlite
storage_dir: /Users/leseb/.llama/distributions/starter/files
provider_id: meta-reference-files
provider_type: inline::localfs
inference:
- config:
api_key: '********'
base_url: https://api.cerebras.ai
provider_id: __disabled__
provider_type: remote::cerebras
- config:
url: ${env.OLLAMA_URL:=http://localhost:11434}
provider_id: __disabled__
provider_type: remote::ollama
- config:
api_token: '********'
max_tokens: ${env.VLLM_MAX_TOKENS:=4096}
tls_verify: ${env.VLLM_TLS_VERIFY:=true}
url: ${env.VLLM_URL}
provider_id: __disabled__
provider_type: remote::vllm
- config:
url: ${env.TGI_URL}
provider_id: __disabled__
provider_type: remote::tgi
- config:
api_token: '********'
huggingface_repo: ${env.INFERENCE_MODEL}
provider_id: __disabled__
provider_type: remote::hf::serverless
- config:
api_token: '********'
endpoint_name: ${env.INFERENCE_ENDPOINT_NAME}
provider_id: __disabled__
provider_type: remote::hf::endpoint
- config:
api_key: '********'
url: https://api.fireworks.ai/inference/v1
provider_id: __disabled__
provider_type: remote::fireworks
- config:
api_key: '********'
url: https://api.together.xyz/v1
provider_id: __disabled__
provider_type: remote::together
- config: {}
provider_id: __disabled__
provider_type: remote::bedrock
- config:
api_token: '********'
url: ${env.DATABRICKS_URL}
provider_id: __disabled__
provider_type: remote::databricks
- config:
api_key: '********'
append_api_version: ${env.NVIDIA_APPEND_API_VERSION:=True}
url: ${env.NVIDIA_BASE_URL:=https://integrate.api.nvidia.com}
provider_id: __disabled__
provider_type: remote::nvidia
- config:
api_token: '********'
url: ${env.RUNPOD_URL:=}
provider_id: __disabled__
provider_type: remote::runpod
- config:
api_key: '********'
provider_id: openai
provider_type: remote::openai
- config:
api_key: '********'
provider_id: __disabled__
provider_type: remote::anthropic
- config:
api_key: '********'
provider_id: __disabled__
provider_type: remote::gemini
- config:
api_key: '********'
url: https://api.groq.com
provider_id: __disabled__
provider_type: remote::groq
- config:
api_key: '********'
openai_compat_api_base: https://api.fireworks.ai/inference/v1
provider_id: __disabled__
provider_type: remote::fireworks-openai-compat
- config:
api_key: '********'
openai_compat_api_base: https://api.llama.com/compat/v1/
provider_id: __disabled__
provider_type: remote::llama-openai-compat
- config:
api_key: '********'
openai_compat_api_base: https://api.together.xyz/v1
provider_id: __disabled__
provider_type: remote::together-openai-compat
- config:
api_key: '********'
openai_compat_api_base: https://api.groq.com/openai/v1
provider_id: __disabled__
provider_type: remote::groq-openai-compat
- config:
api_key: '********'
openai_compat_api_base: https://api.sambanova.ai/v1
provider_id: __disabled__
provider_type: remote::sambanova-openai-compat
- config:
api_key: '********'
openai_compat_api_base: https://api.cerebras.ai/v1
provider_id: __disabled__
provider_type: remote::cerebras-openai-compat
- config:
api_key: '********'
url: https://api.sambanova.ai/v1
provider_id: __disabled__
provider_type: remote::sambanova
- config:
api_key: '********'
url: ${env.PASSTHROUGH_URL}
provider_id: __disabled__
provider_type: remote::passthrough
- config: {}
provider_id: sentence-transformers
provider_type: inline::sentence-transformers
post_training:
- config:
checkpoint_format: huggingface
device: cpu
distributed_backend: null
provider_id: huggingface
provider_type: inline::huggingface
safety:
- config:
excluded_categories: []
provider_id: llama-guard
provider_type: inline::llama-guard
scoring:
- config: {}
provider_id: basic
provider_type: inline::basic
- config: {}
provider_id: llm-as-judge
provider_type: inline::llm-as-judge
- config:
openai_api_key: '********'
provider_id: braintrust
provider_type: inline::braintrust
telemetry:
- config:
otel_exporter_otlp_endpoint: null
service_name: "\u200B"
sinks: console,sqlite
sqlite_db_path: /Users/leseb/.llama/distributions/starter/trace_store.db
provider_id: meta-reference
provider_type: inline::meta-reference
tool_runtime:
- config:
api_key: '********'
max_results: 3
provider_id: brave-search
provider_type: remote::brave-search
- config:
api_key: '********'
max_results: 3
provider_id: tavily-search
provider_type: remote::tavily-search
- config: {}
provider_id: rag-runtime
provider_type: inline::rag-runtime
- config: {}
provider_id: model-context-protocol
provider_type: remote::model-context-protocol
vector_io:
- config:
kvstore:
db_path: /Users/leseb/.llama/distributions/starter/faiss_store.db
type: sqlite
provider_id: faiss
provider_type: inline::faiss
- config:
db_path: ${env.SQLITE_STORE_DIR:=~/.llama/distributions/starter}/sqlite_vec.db
provider_id: __disabled__
provider_type: inline::sqlite-vec
- config:
db_path: ${env.MILVUS_DB_PATH:=~/.llama/distributions/starter}/milvus.db
kvstore:
db_path: ${env.SQLITE_STORE_DIR:=~/.llama/distributions/starter}/milvus_registry.db
type: sqlite
provider_id: __disabled__
provider_type: inline::milvus
- config:
url: ${env.CHROMADB_URL:=}
provider_id: __disabled__
provider_type: remote::chromadb
- config:
db: ${env.PGVECTOR_DB:=}
host: ${env.PGVECTOR_HOST:=localhost}
password: '********'
port: ${env.PGVECTOR_PORT:=5432}
user: ${env.PGVECTOR_USER:=}
provider_id: __disabled__
provider_type: remote::pgvector
scoring_fns: []
server:
auth: null
host: null
port: 8321
quota: null
tls_cafile: null
tls_certfile: null
tls_keyfile: null
shields: []
tool_groups:
- args: null
mcp_endpoint: null
provider_id: tavily-search
toolgroup_id: builtin::websearch
- args: null
mcp_endpoint: null
provider_id: rag-runtime
toolgroup_id: builtin::rag
vector_dbs: []
version: 2
WARNING 2025-07-07 16:52:48,503 llama_stack.distribution.resolver:203 core: Provider `remote::cerebras` for API `Api.inference` is disabled
WARNING 2025-07-07 16:52:48,504 llama_stack.distribution.resolver:203 core: Provider `remote::ollama` for API `Api.inference` is disabled
WARNING 2025-07-07 16:52:48,504 llama_stack.distribution.resolver:203 core: Provider `remote::vllm` for API `Api.inference` is disabled
WARNING 2025-07-07 16:52:48,505 llama_stack.distribution.resolver:203 core: Provider `remote::tgi` for API `Api.inference` is disabled
WARNING 2025-07-07 16:52:48,505 llama_stack.distribution.resolver:203 core: Provider `remote::hf::serverless` for API `Api.inference` is disabled
WARNING 2025-07-07 16:52:48,505 llama_stack.distribution.resolver:203 core: Provider `remote::hf::endpoint` for API `Api.inference` is disabled
WARNING 2025-07-07 16:52:48,506 llama_stack.distribution.resolver:203 core: Provider `remote::fireworks` for API `Api.inference` is disabled
WARNING 2025-07-07 16:52:48,506 llama_stack.distribution.resolver:203 core: Provider `remote::together` for API `Api.inference` is disabled
WARNING 2025-07-07 16:52:48,506 llama_stack.distribution.resolver:203 core: Provider `remote::bedrock` for API `Api.inference` is disabled
WARNING 2025-07-07 16:52:48,507 llama_stack.distribution.resolver:203 core: Provider `remote::databricks` for API `Api.inference` is disabled
WARNING 2025-07-07 16:52:48,507 llama_stack.distribution.resolver:203 core: Provider `remote::nvidia` for API `Api.inference` is disabled
WARNING 2025-07-07 16:52:48,507 llama_stack.distribution.resolver:203 core: Provider `remote::runpod` for API `Api.inference` is disabled
WARNING 2025-07-07 16:52:48,508 llama_stack.distribution.resolver:203 core: Provider `remote::anthropic` for API `Api.inference` is disabled
WARNING 2025-07-07 16:52:48,508 llama_stack.distribution.resolver:203 core: Provider `remote::gemini` for API `Api.inference` is disabled
WARNING 2025-07-07 16:52:48,508 llama_stack.distribution.resolver:203 core: Provider `remote::groq` for API `Api.inference` is disabled
WARNING 2025-07-07 16:52:48,509 llama_stack.distribution.resolver:203 core: Provider `remote::fireworks-openai-compat` for API `Api.inference` is
disabled
WARNING 2025-07-07 16:52:48,509 llama_stack.distribution.resolver:203 core: Provider `remote::llama-openai-compat` for API `Api.inference` is
disabled
WARNING 2025-07-07 16:52:48,509 llama_stack.distribution.resolver:203 core: Provider `remote::together-openai-compat` for API `Api.inference` is
disabled
WARNING 2025-07-07 16:52:48,510 llama_stack.distribution.resolver:203 core: Provider `remote::groq-openai-compat` for API `Api.inference` is disabled
WARNING 2025-07-07 16:52:48,510 llama_stack.distribution.resolver:203 core: Provider `remote::sambanova-openai-compat` for API `Api.inference` is
disabled
WARNING 2025-07-07 16:52:48,511 llama_stack.distribution.resolver:203 core: Provider `remote::cerebras-openai-compat` for API `Api.inference` is
disabled
WARNING 2025-07-07 16:52:48,511 llama_stack.distribution.resolver:203 core: Provider `remote::sambanova` for API `Api.inference` is disabled
WARNING 2025-07-07 16:52:48,511 llama_stack.distribution.resolver:203 core: Provider `remote::passthrough` for API `Api.inference` is disabled
WARNING 2025-07-07 16:52:48,512 llama_stack.distribution.resolver:203 core: Provider `inline::sqlite-vec` for API `Api.vector_io` is disabled
WARNING 2025-07-07 16:52:48,512 llama_stack.distribution.resolver:203 core: Provider `inline::milvus` for API `Api.vector_io` is disabled
WARNING 2025-07-07 16:52:48,512 llama_stack.distribution.resolver:203 core: Provider `remote::chromadb` for API `Api.vector_io` is disabled
WARNING 2025-07-07 16:52:48,513 llama_stack.distribution.resolver:203 core: Provider `remote::pgvector` for API `Api.vector_io` is disabled
WARNING 2025-07-07 16:53:10,036 opentelemetry.trace:537 uncategorized: Overriding of current TracerProvider is not allowed
INFO 2025-07-07 16:53:10,710 __main__:577 server: Listening on ['::', '0.0.0.0']:8321
INFO: Started server process [86255]
INFO: Waiting for application startup.
INFO 2025-07-07 16:53:10,736 __main__:158 server: Starting up
INFO: Application startup complete.
INFO: Uvicorn running on http://['::', '0.0.0.0']:8321 (Press CTRL+C to quit)
INFO: ::1:57991 - "GET /v1/models HTTP/1.1" 200 OK
14:53:14.755 [START] /v1/models
14:53:14.757 [END] /v1/models [StatusCode.OK] (2.65ms)
^CINFO: Shutting down
INFO: Waiting for application shutdown.
INFO 2025-07-07 16:53:24,268 __main__:160 server: Shutting down
INFO 2025-07-07 16:53:24,268 __main__:144 server: Shutting down ModelsRoutingTable
INFO 2025-07-07 16:53:24,269 __main__:144 server: Shutting down DatasetsRoutingTable
INFO 2025-07-07 16:53:24,269 __main__:144 server: Shutting down DatasetIORouter
INFO 2025-07-07 16:53:24,269 __main__:144 server: Shutting down TelemetryAdapter
INFO 2025-07-07 16:53:24,269 __main__:144 server: Shutting down InferenceRouter
INFO 2025-07-07 16:53:24,270 __main__:144 server: Shutting down LocalfsFilesImpl
WARNING 2025-07-07 16:53:24,270 __main__:149 server: No shutdown method for LocalfsFilesImpl
INFO 2025-07-07 16:53:24,270 __main__:144 server: Shutting down ShieldsRoutingTable
INFO 2025-07-07 16:53:24,271 __main__:144 server: Shutting down SafetyRouter
INFO 2025-07-07 16:53:24,271 __main__:144 server: Shutting down VectorDBsRoutingTable
INFO 2025-07-07 16:53:24,271 __main__:144 server: Shutting down VectorIORouter
INFO 2025-07-07 16:53:24,272 __main__:144 server: Shutting down ToolGroupsRoutingTable
INFO 2025-07-07 16:53:24,272 __main__:144 server: Shutting down ToolRuntimeRouter
INFO 2025-07-07 16:53:24,273 __main__:144 server: Shutting down MetaReferenceAgentsImpl
INFO 2025-07-07 16:53:24,274 __main__:144 server: Shutting down HuggingFacePostTrainingImpl
INFO 2025-07-07 16:53:24,274 __main__:144 server: Shutting down ScoringFunctionsRoutingTable
INFO 2025-07-07 16:53:24,275 __main__:144 server: Shutting down ScoringRouter
INFO 2025-07-07 16:53:24,275 __main__:144 server: Shutting down BenchmarksRoutingTable
INFO 2025-07-07 16:53:24,276 __main__:144 server: Shutting down EvalRouter
INFO 2025-07-07 16:53:24,276 __main__:144 server: Shutting down DistributionInspectImpl
INFO 2025-07-07 16:53:24,277 __main__:144 server: Shutting down ProviderImpl
INFO: Application shutdown complete.
INFO: Finished server process [86255]
Weird, I get the error when I run with the venv built by the "llama stack build" command (./built/bin/llama run ...)
but not when I use the main venv (llama run ...).
Looking into it, possibly the difference is that a released version of the llama_stack library gets installed into the built env.
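A quick check for this (a sketch, assuming the venv layout from this thread):

```bash
# Which copy of llama_stack does the built venv's Python resolve?
./built/bin/python -m pip show llama-stack
./built/bin/python -c 'import llama_stack; print(llama_stack.__file__)'
# A path under built/lib64/.../site-packages means the released wheel;
# a path under the git checkout means the local source.
```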
OK, I've just discovered $LLAMA_STACK_DIR; this solves my problem:
LLAMA_STACK_DIR=$PWD llama stack build ....
which makes me wonder why I haven't needed it up to now for other local edits...
Found the root cause of the version inconsistency:
When running ./built/bin/llama stack run, the process initially uses the code from the "built" venv, which in my case contains the released llama-stack library (not the version from git). This version lacks some of the newer handling around "disabled" inference providers, so I hit an error about the env variable not having a default.
However, the process then executes:
bash /home/derekh/workarea/llama-stack/built/lib64/python3.13/site-packages/llama_stack/distribution/start_stack.sh venv built 8321 --config /home/derekh/workarea/llama-stack/llama_stack/templates/starter/run.yaml
Which subsequently runs:
python -m llama_stack.distribution.server.server --config /home/derekh/workarea/llama-stack/llama_stack/templates/starter/run.yaml --port 8321
During this process handoff, the system switches from the released llama-stack library in the "built" venv to the llama-stack library from my local git repository.
This explains the inconsistent behaviour: the initial validation uses the older release code (which doesn't handle disabled providers properly), while the server startup uses the newer git version.
I can work around my problem by setting LLAMA_STACK_DIR=$PWD.
But the question is: should this be fixed to consistently use the same version throughout the entire process (i.e., stick with the built venv version)?
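A small diagnostic sketch for the handoff described above, comparing what each stage imports (paths assume the layout from this thread):

```bash
# Stage 1: the CLI entry point resolves llama_stack inside the built venv.
./built/bin/python -c 'import llama_stack; print(llama_stack.__file__)'

# Stage 2: start_stack.sh re-execs a bare "python -m llama_stack...".
# Run from the repo root, CPython puts the working directory on sys.path,
# so the local checkout can shadow the installed package:
cd ~/workarea/llama-stack
python -c 'import llama_stack; print(llama_stack.__file__)'

# If the two printed paths differ, config validation and server startup
# ran different versions of the library.
```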
Yes, I've had that issue in the past as well. When building on changes that are not yet included in a release, you need to use the current code.
Can this issue be closed or re-titled given the investigation?
I've re-titled and updated the description
What you're seeing is a textbook case of version collision during deploy-init (WFGY No.15 & No.16). The venv boots with one version but the runtime loads a different local codebase, triggering validation mismatch. We treat this as a pre-deploy collapse — model load succeeds but API schema fails. In WFGY we enforce lockstep version pinning at buildtime + sanity hash check at init. Let me know if you want a CLI helper to auto-pin this; we built one for our deployment pipelines.
This issue has been automatically marked as stale because it has not had activity within 60 days. It will be automatically closed if no further activity occurs within 30 days.
This issue has been automatically closed due to inactivity. Please feel free to reopen if you feel it is still relevant!