llama-stack
Custom distribution image with remote vLLM provider fails to start
System Info
- OS: Ubuntu 24.04
- CUDA version: 12.8
- GPU: NVIDIA A40
- GPU driver: 570.86.10
Information
- [ ] The official example scripts
- [x] My own modified scripts
🐛 Describe the bug
build.yaml:
version: '2'
distribution_spec:
  description: Custom distribution of Llama Stack with vLLM and PgVector for vector IO.
  providers:
    inference:
    - remote::vllm
    vector_io:
    - remote::pgvector
    safety:
    - inline::llama-guard
    agents:
    - inline::meta-reference
    telemetry:
    - inline::meta-reference
    datasetio:
    - remote::huggingface
    tool_runtime:
    - remote::brave-search
    - inline::rag-runtime
    - remote::model-context-protocol
image_type: container
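For reference, the image was produced with a build command along these lines (a sketch; the exact invocation may differ, and image_type: container is already set in the config above, so no extra flag should be needed):

llama stack build --config build.yaml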
The image built with this configuration (against the 0.1.6 codebase) fails to run. Here is the run.yaml:
version: '2'
image_name: custom-distribution
container_image: custom-distribution
apis:
- inference
- vector_io
- safety
- agents
- telemetry
- datasetio
- tool_runtime
providers:
  inference:
  - provider_id: vllm-1
    provider_type: remote::vllm
    config:
      url: ${env.VLLM_URL}
      max_tokens: ${env.VLLM_MAX_TOKENS:4096}
  vector_io:
  - provider_id: pgvector
    provider_type: remote::pgvector
    config:
      host: ${env.PGVECTOR_HOST:localhost}
      port: ${env.PGVECTOR_PORT:5432}
      db: ${env.PGVECTOR_DB}
      user: ${env.PGVECTOR_USER}
      password: ${env.PGVECTOR_PASSWORD}
  safety:
  - provider_id: llama-guard
    provider_type: inline::llama-guard
    config: {}
  agents:
  - provider_id: meta-reference
    provider_type: inline::meta-reference
    config:
      persistence_store:
        type: sqlite
        namespace: null
        db_path: ${env.SQLITE_STORE_DIR:~/.llama/distributions/dell-distribution}/agents_store.db
  telemetry:
  - provider_id: meta-reference
    provider_type: inline::meta-reference
    config:
      service_name: ${env.OTEL_SERVICE_NAME:llama-stack}
      sinks: ${env.TELEMETRY_SINKS:console,sqlite}
      sqlite_db_path: ${env.SQLITE_DB_PATH:~/.llama/distributions/dell-distribution/trace_store.db}
  tool_runtime:
  - provider_id: rag-runtime-1
    provider_type: inline::rag-runtime
    config: {}
  - provider_id: model-context-protocol-2
    provider_type: remote::model-context-protocol
    config: {}
metadata_store: null
models: []
shields: []
vector_dbs: []
datasets: []
scoring_fns: []
benchmarks: []
tool_groups: []
server:
  port: 8321
  tls_certfile: null
  tls_keyfile: null
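The container is then started roughly as follows (a sketch, not the exact command; the mount path, port, and --env flags are assumptions based on the standard distribution docs, and the placeholder values are not from this setup):

docker run -it -p 8321:8321 \
  -v ./run.yaml:/app/run.yaml \
  custom-distribution \
  --yaml-config /app/run.yaml --port 8321 \
  --env VLLM_URL=http://<vllm-host>:8000/v1 \
  --env PGVECTOR_DB=<db> --env PGVECTOR_USER=<user> --env PGVECTOR_PASSWORD=<password>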
Error during container startup:
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/local/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.10/site-packages/llama_stack/distribution/server/server.py", line 487, in <module>
    main()
  File "/usr/local/lib/python3.10/site-packages/llama_stack/distribution/server/server.py", line 388, in main
    impls = asyncio.run(construct_stack(config))
  File "/usr/local/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.10/site-packages/llama_stack/distribution/stack.py", line 219, in construct_stack
    impls = await resolve_impls(run_config, provider_registry or get_provider_registry(), dist_registry)
  File "/usr/local/lib/python3.10/site-packages/llama_stack/distribution/resolver.py", line 133, in resolve_impls
    return await instantiate_providers(sorted_providers, router_apis, dist_registry)
  File "/usr/local/lib/python3.10/site-packages/llama_stack/distribution/resolver.py", line 271, in instantiate_providers
    impl = await instantiate_provider(provider, deps, inner_impls, dist_registry)
  File "/usr/local/lib/python3.10/site-packages/llama_stack/distribution/resolver.py", line 356, in instantiate_provider
    impl = await fn(*args)
  File "/usr/local/lib/python3.10/site-packages/llama_stack/providers/remote/inference/vllm/__init__.py", line 11, in get_adapter_impl
    from .vllm import VLLMInferenceAdapter
  File "/usr/local/lib/python3.10/site-packages/llama_stack/providers/remote/inference/vllm/vllm.py", line 54, in <module>
    from llama_stack.providers.utils.inference.openai_compat import (
  File "/usr/local/lib/python3.10/site-packages/llama_stack/providers/utils/inference/openai_compat.py", line 92, in <module>
    from llama_stack.providers.utils.inference.prompt_adapter import (
  File "/usr/local/lib/python3.10/site-packages/llama_stack/providers/utils/inference/prompt_adapter.py", line 49, in <module>
    from llama_stack.models.llama.llama3.prompt_templates import (
  File "/usr/local/lib/python3.10/site-packages/llama_stack/models/llama/llama3/prompt_templates/__init__.py", line 14, in <module>
    from .base import PromptTemplate, PromptTemplateGeneratorBase  # noqa: F401
  File "/usr/local/lib/python3.10/site-packages/llama_stack/models/llama/llama3/prompt_templates/base.py", line 17, in <module>
    from jinja2 import Template
ModuleNotFoundError: No module named 'jinja2'
exit status 1
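The failure can be reproduced without going through the full server startup by importing the module directly in the built image (a minimal check, assuming python is the image's interpreter on PATH):

docker run --rm --entrypoint python custom-distribution -c "import jinja2"
# Hits the same ModuleNotFoundError; pip-installing jinja2 into the image (or rebuilding
# after adding it to the image's dependencies) lets startup get past this import.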
Error logs
(Same traceback as shown in the description above.)
Expected behavior
The custom Llama Stack distribution image starts as a container and the endpoint becomes ready.
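Once the server is up, a readiness check along these lines should succeed (the /v1/health route is an assumption based on the standard distributions; adjust the path if it differs):

curl http://localhost:8321/v1/health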