
Custom distribution image with remote vLLM provider fails to start

Open · rchaganti opened this issue on Mar 13, 2025 · 2 comments

System Info

  • OS: Ubuntu 24.04
  • CUDA version: 12.8
  • GPU: NVIDIA A40
  • GPU driver: 570.86.10

Information

  • [ ] The official example scripts
  • [x] My own modified scripts

🐛 Describe the bug

build.yaml:

version: '2'
distribution_spec:
  description: Custom distribution of Llama Stack with vLLM and PgVector for vector IO.
  providers:
    inference:
    - remote::vllm
    vector_io:
    - remote::pgvector
    safety:
    - inline::llama-guard
    agents:
    - inline::meta-reference
    telemetry:
    - inline::meta-reference
    datasetio:
    - remote::huggingface
    tool_runtime:
    - remote::brave-search
    - inline::rag-runtime
    - remote::model-context-protocol
image_type: container
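
For context, a container image is produced from this build.yaml with the llama stack CLI; the command used was along these lines (flag names as documented for the 0.1.x CLI, so verify against llama stack build --help):

# Build the container image from the build.yaml above
llama stack build --config build.yaml --image-type container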

The image built from this configuration (against the 0.1.6 code base) fails to start. Here is the run.yaml:

version: '2'
image_name: custom-distribution
container_image: custom-distribution
apis:
- inference
- vector_io
- safety
- agents
- telemetry
- datasetio
- tool_runtime
providers:
  inference:
  - provider_id: vllm-1
    provider_type: remote::vllm
    config:
      url: ${env.VLLM_URL}
      max_tokens: ${env.VLLM_MAX_TOKENS:4096}
  vector_io:
  - provider_id: pgvector
    provider_type: remote::pgvector
    config:
      host: ${env.PGVECTOR_HOST:localhost}
      port: ${env.PGVECTOR_PORT:5432}
      db: ${env.PGVECTOR_DB}
      user: ${env.PGVECTOR_USER}
      password: ${env.PGVECTOR_PASSWORD}
  safety:
  - provider_id: llama-guard
    provider_type: inline::llama-guard
    config: {}
  agents:
  - provider_id: meta-reference
    provider_type: inline::meta-reference
    config:
      persistence_store:
        type: sqlite
        namespace: null
        db_path: ${env.SQLITE_STORE_DIR:~/.llama/distributions/dell-distribution}/agents_store.db
  telemetry:
  - provider_id: meta-reference
    provider_type: inline::meta-reference
    config:
      service_name: ${env.OTEL_SERVICE_NAME:llama-stack}
      sinks: ${env.TELEMETRY_SINKS:console,sqlite}
      sqlite_db_path: ${env.SQLITE_DB_PATH:~/.llama/distributions/dell-distribution/trace_store.db}
  tool_runtime:
  - provider_id: rag-runtime-1
    provider_type: inline::rag-runtime
    config: {}
  - provider_id: model-context-protocol-2
    provider_type: remote::model-context-protocol
    config: {}
metadata_store: null
models: []
shields: []
vector_dbs: []
datasets: []
scoring_fns: []
benchmarks: []
tool_groups: []
server:
  port: 8321
  tls_certfile: null
  tls_keyfile: null
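
The run.yaml uses ${env.VAR} / ${env.VAR:default} substitution, so the variables without defaults (VLLM_URL, PGVECTOR_DB, PGVECTOR_USER, PGVECTOR_PASSWORD) must be supplied when the container starts. A rough sketch of the launch command follows; the image tag, mount path, and host names in angle brackets are placeholders, and the config argument depends on what the image entrypoint expects:

# Start the custom image; values in angle brackets are placeholders
docker run --rm -p 8321:8321 \
  -e VLLM_URL=http://<vllm-host>:8000/v1 \
  -e PGVECTOR_HOST=<pg-host> \
  -e PGVECTOR_DB=<db-name> \
  -e PGVECTOR_USER=<db-user> \
  -e PGVECTOR_PASSWORD=<db-password> \
  -v $(pwd)/run.yaml:/app/run.yaml \
  custom-distribution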

Error during container startup:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/local/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.10/site-packages/llama_stack/distribution/server/server.py", line 487, in <module>
    main()
  File "/usr/local/lib/python3.10/site-packages/llama_stack/distribution/server/server.py", line 388, in main
    impls = asyncio.run(construct_stack(config))
  File "/usr/local/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.10/site-packages/llama_stack/distribution/stack.py", line 219, in construct_stack
    impls = await resolve_impls(run_config, provider_registry or get_provider_registry(), dist_registry)
  File "/usr/local/lib/python3.10/site-packages/llama_stack/distribution/resolver.py", line 133, in resolve_impls
    return await instantiate_providers(sorted_providers, router_apis, dist_registry)
  File "/usr/local/lib/python3.10/site-packages/llama_stack/distribution/resolver.py", line 271, in instantiate_providers
    impl = await instantiate_provider(provider, deps, inner_impls, dist_registry)
  File "/usr/local/lib/python3.10/site-packages/llama_stack/distribution/resolver.py", line 356, in instantiate_provider
    impl = await fn(*args)
  File "/usr/local/lib/python3.10/site-packages/llama_stack/providers/remote/inference/vllm/__init__.py", line 11, in get_adapter_impl
    from .vllm import VLLMInferenceAdapter
  File "/usr/local/lib/python3.10/site-packages/llama_stack/providers/remote/inference/vllm/vllm.py", line 54, in <module>
    from llama_stack.providers.utils.inference.openai_compat import (
  File "/usr/local/lib/python3.10/site-packages/llama_stack/providers/utils/inference/openai_compat.py", line 92, in <module>
    from llama_stack.providers.utils.inference.prompt_adapter import (
  File "/usr/local/lib/python3.10/site-packages/llama_stack/providers/utils/inference/prompt_adapter.py", line 49, in <module>
    from llama_stack.models.llama.llama3.prompt_templates import (
  File "/usr/local/lib/python3.10/site-packages/llama_stack/models/llama/llama3/prompt_templates/__init__.py", line 14, in <module>
    from .base import PromptTemplate, PromptTemplateGeneratorBase  # noqa: F401
  File "/usr/local/lib/python3.10/site-packages/llama_stack/models/llama/llama3/prompt_templates/base.py", line 17, in <module>
    from jinja2 import Template
ModuleNotFoundError: No module named 'jinja2'
exit status 1
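
The missing module can be confirmed without starting the server by importing it directly inside the built image (assuming the image tag is custom-distribution and python is on PATH, which the traceback paths suggest):

# Verify that jinja2 is absent from the image's site-packages
docker run --rm --entrypoint python custom-distribution -c "import jinja2"
# -> ModuleNotFoundError: No module named 'jinja2'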

Error logs

Same traceback as shown above: the server exits with ModuleNotFoundError: No module named 'jinja2' (exit status 1).

Expected behavior

The custom Llama Stack distribution image starts as a container and the server endpoint becomes ready on port 8321.
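
As a stopgap, the built image can be extended with the missing package; a minimal sketch is below (other transitive dependencies could be missing as well, jinja2 is just the first import that fails):

# Workaround sketch: layer jinja2 on top of the existing image
cat > Dockerfile.fix <<'EOF'
FROM custom-distribution
RUN pip install --no-cache-dir jinja2
EOF
docker build -t custom-distribution:fixed -f Dockerfile.fix .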

rchaganti · Mar 13 '25 07:03