
Unable to spin up Wren AI Service with 'meta-llama-3.2-1b-instruct' (LM Studio)

Open prathameshbelurkar opened this issue 11 months ago • 2 comments

I am using 'meta-llama-3.2-1b-instruct', which is downloaded in LM Studio.
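A quick sanity check that the LM Studio server is up (assuming LM Studio's default port 1234, which matches the api_base used in the config below):

curl http://localhost:1234/v1/models   # should list meta-llama-3.2-1b-instruct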


.wrenai/.env

COMPOSE_PROJECT_NAME=wrenai
PLATFORM=linux/amd64

PROJECT_DIR=.

# service port
WREN_ENGINE_PORT=8080
WREN_ENGINE_SQL_PORT=7432
WREN_AI_SERVICE_PORT=5555
WREN_UI_PORT=3000
IBIS_SERVER_PORT=8000
WREN_UI_ENDPOINT=http://wren-ui:${WREN_UI_PORT}

# ai service settings
QDRANT_HOST=qdrant
SHOULD_FORCE_DEPLOY=1

# vendor keys
LLM_OPENAI_API_KEY=
EMBEDDER_OPENAI_API_KEY=
LLM_AZURE_OPENAI_API_KEY=
EMBEDDER_AZURE_OPENAI_API_KEY=
QDRANT_API_KEY=

# version
# CHANGE THIS TO THE LATEST VERSION
WREN_PRODUCT_VERSION=0.14.0
WREN_ENGINE_VERSION=0.13.1
WREN_AI_SERVICE_VERSION=0.14.0
IBIS_SERVER_VERSION=0.13.1
WREN_UI_VERSION=0.19.1
WREN_BOOTSTRAP_VERSION=0.1.5

# user id (uuid v4)
USER_UUID=

# for other services
POSTHOG_API_KEY=phc_nhF32aj4xHXOZb0oqr2cn4Oy9uiWzz6CCP4KZmRq9aE
POSTHOG_HOST=https://app.posthog.com
TELEMETRY_ENABLED=true
# this is for telemetry to know the model; the ai-service might be able to provide an endpoint to get this information
GENERATION_MODEL=gpt-4o-mini
LANGFUSE_SECRET_KEY=
LANGFUSE_PUBLIC_KEY=

# the port exposes to the host
# OPTIONAL: change the port if you have a conflict
HOST_PORT=3000
AI_SERVICE_FORWARD_PORT=5555

# Wren UI
EXPERIMENTAL_ENGINE_RUST_VERSION=false

LLM_LM_STUDIO_API_KEY=random

.wrenai/config.yaml

type: llm
provider: litellm_llm
timeout: 120
models:
# omitted other model definitions
- kwargs:
    n: 1
    temperature: 0
    response_format:
      type: json_object
  # please replace with your model name here, should be lm_studio/<MODEL_NAME>
  model: lm_studio/mlx-community/meta-llama-3.2-1b-instruct
  api_base: http://host.docker.internal:1234/v1
  api_key_name: LLM_LM_STUDIO_API_KEY

---
type: embedder
provider: ollama_embedder
models:
  - model: nomic-embed-text
    dimension: 768
url: http://localhost:11434
timeout: 120

---
type: engine
provider: wren_ui
endpoint: http://localhost:3000

---
type: engine
provider: wren_ibis
endpoint: http://localhost:8000
source: bigquery
manifest: "" # base64 encoded string of the MDL
connection_info: "" # base64 encoded string of the connection info

---
type: engine
provider: wren_engine
endpoint: http://localhost:8080
manifest: ""

---
type: document_store
provider: qdrant
location: http://qdrant:6333
embedding_model_dim: 768
timeout: 120
recreate_index: true

---
type: pipeline
pipes:
  - name: db_schema_indexing
    embedder: openai_embedder.text-embedding-3-large
    document_store: qdrant
  - name: historical_question_indexing
    embedder: openai_embedder.text-embedding-3-large
    document_store: qdrant
  - name: table_description_indexing
    embedder: openai_embedder.text-embedding-3-large
    document_store: qdrant
  - name: db_schema_retrieval
    llm: litellm_llm.gpt-4o-mini-2024-07-18
    embedder: openai_embedder.text-embedding-3-large
    document_store: qdrant
  - name: historical_question_retrieval
    embedder: openai_embedder.text-embedding-3-large
    document_store: qdrant
  - name: sql_generation
    llm: litellm_llm.gpt-4o-mini-2024-07-18
    engine: wren_ui
  - name: sql_correction
    llm: litellm_llm.gpt-4o-mini-2024-07-18
    engine: wren_ui
  - name: followup_sql_generation
    llm: litellm_llm.gpt-4o-mini-2024-07-18
    engine: wren_ui
  - name: sql_summary
    llm: litellm_llm.gpt-4o-mini-2024-07-18
  - name: sql_answer
    llm: litellm_llm.gpt-4o-mini-2024-07-18
    engine: wren_ui
  - name: sql_breakdown
    llm: litellm_llm.gpt-4o-mini-2024-07-18
    engine: wren_ui
  - name: sql_expansion
    llm: litellm_llm.gpt-4o-mini-2024-07-18
    engine: wren_ui
  - name: sql_explanation
    llm: litellm_llm.gpt-4o-mini-2024-07-18
  - name: sql_regeneration
    llm: litellm_llm.gpt-4o-mini-2024-07-18
    engine: wren_ui
  - name: semantics_description
    llm: litellm_llm.gpt-4o-mini-2024-07-18
  - name: relationship_recommendation
    llm: litellm_llm.gpt-4o-mini-2024-07-18
    engine: wren_ui
  - name: question_recommendation
    llm: litellm_llm.gpt-4o-mini-2024-07-18
  - name: question_recommendation_db_schema_retrieval
    llm: litellm_llm.gpt-4o-mini-2024-07-18
    embedder: openai_embedder.text-embedding-3-large
    document_store: qdrant
  - name: question_recommendation_sql_generation
    llm: litellm_llm.gpt-4o-mini-2024-07-18
    engine: wren_ui
  - name: chart_generation
    llm: litellm_llm.gpt-4o-mini-2024-07-18
  - name: chart_adjustment
    llm: litellm_llm.gpt-4o-mini-2024-07-18
  - name: intent_classification
    llm: litellm_llm.gpt-4o-mini-2024-07-18
    embedder: openai_embedder.text-embedding-3-large
    document_store: qdrant
  - name: data_assistance
    llm: litellm_llm.gpt-4o-mini-2024-07-18
  - name: sql_pairs_indexing
    document_store: qdrant
    embedder: openai_embedder.text-embedding-3-large
  - name: sql_pairs_deletion
    document_store: qdrant
    embedder: openai_embedder.text-embedding-3-large 
  - name: sql_pairs_retrieval
    document_store: qdrant
    embedder: openai_embedder.text-embedding-3-large
    llm: litellm_llm.gpt-4o-mini-2024-07-18
  - name: preprocess_sql_data
    llm: litellm_llm.gpt-4o-mini-2024-07-18
  - name: sql_executor
    engine: wren_ui
  - name: sql_question_generation
    llm: litellm_llm.gpt-4o-mini-2024-07-18
  - name: sql_generation_reasoning
    llm: litellm_llm.gpt-4o-mini-2024-07-18

---
settings:
  host: 127.0.0.1
  port: 5556
  column_indexing_batch_size: 50
  table_retrieval_size: 10
  table_column_retrieval_size: 100
  allow_using_db_schemas_without_pruning: false
  query_cache_maxsize: 1000
  query_cache_ttl: 3600
  langfuse_host: https://cloud.langfuse.com
  langfuse_enable: true
  logging_level: DEBUG
  development: true

Getting this error in the wrenai-wren-ai-service-1 container:

2025-02-09 15:17:27 Timeout: wren-ai-service did not start within 60 seconds
2025-02-09 15:17:28 Waiting for qdrant to start...
2025-02-09 15:17:28 qdrant has started.
2025-02-09 15:17:28 Waiting for wren-ai-service to start...
2025-02-09 15:17:33 INFO:     Started server process [8]
2025-02-09 15:17:33 INFO:     Waiting for application startup.
2025-02-09 15:17:33 I0209 09:47:33.250 8 wren-ai-service:42] Imported Provider: src.providers.document_store
2025-02-09 15:17:33 I0209 09:47:33.970 8 wren-ai-service:66] Registering provider: openai_embedder
2025-02-09 15:17:33 I0209 09:47:33.971 8 wren-ai-service:66] Registering provider: qdrant
2025-02-09 15:17:33 I0209 09:47:33.974 8 wren-ai-service:42] Imported Provider: src.providers.document_store.qdrant
2025-02-09 15:17:33 I0209 09:47:33.975 8 wren-ai-service:42] Imported Provider: src.providers.embedder
2025-02-09 15:17:33 I0209 09:47:33.978 8 wren-ai-service:66] Registering provider: azure_openai_embedder
2025-02-09 15:17:33 I0209 09:47:33.978 8 wren-ai-service:42] Imported Provider: src.providers.embedder.azure_openai
2025-02-09 15:17:33 I0209 09:47:33.981 8 wren-ai-service:66] Registering provider: ollama_embedder
2025-02-09 15:17:33 I0209 09:47:33.981 8 wren-ai-service:42] Imported Provider: src.providers.embedder.ollama
2025-02-09 15:17:33 I0209 09:47:33.982 8 wren-ai-service:42] Imported Provider: src.providers.embedder.openai
2025-02-09 15:17:33 I0209 09:47:33.982 8 wren-ai-service:42] Imported Provider: src.providers.engine
2025-02-09 15:17:33 I0209 09:47:33.984 8 wren-ai-service:66] Registering provider: wren_ui
2025-02-09 15:17:33 I0209 09:47:33.984 8 wren-ai-service:66] Registering provider: wren_ibis
2025-02-09 15:17:33 I0209 09:47:33.984 8 wren-ai-service:66] Registering provider: wren_engine
2025-02-09 15:17:33 I0209 09:47:33.984 8 wren-ai-service:42] Imported Provider: src.providers.engine.wren
2025-02-09 15:17:33 I0209 09:47:33.985 8 wren-ai-service:42] Imported Provider: src.providers.llm
2025-02-09 15:17:34 I0209 09:47:34.000 8 wren-ai-service:66] Registering provider: azure_openai_llm
2025-02-09 15:17:34 I0209 09:47:34.000 8 wren-ai-service:42] Imported Provider: src.providers.llm.azure_openai
2025-02-09 15:17:34 /app/.venv/lib/python3.12/site-packages/pydantic/_internal/_config.py:345: UserWarning: Valid config keys have changed in V2:
2025-02-09 15:17:34 * 'fields' has been removed
2025-02-09 15:17:34   warnings.warn(message, UserWarning)
2025-02-09 15:17:35 I0209 09:47:35.742 8 wren-ai-service:66] Registering provider: litellm_llm
2025-02-09 15:17:35 I0209 09:47:35.743 8 wren-ai-service:42] Imported Provider: src.providers.llm.litellm
2025-02-09 15:17:35 I0209 09:47:35.746 8 wren-ai-service:66] Registering provider: ollama_llm
2025-02-09 15:17:35 I0209 09:47:35.747 8 wren-ai-service:42] Imported Provider: src.providers.llm.ollama
2025-02-09 15:17:35 I0209 09:47:35.843 8 wren-ai-service:66] Registering provider: openai_llm
2025-02-09 15:17:35 I0209 09:47:35.843 8 wren-ai-service:42] Imported Provider: src.providers.llm.openai
2025-02-09 15:17:35 I0209 09:47:35.844 8 wren-ai-service:42] Imported Provider: src.providers.loader
2025-02-09 15:17:35 I0209 09:47:35.844 8 wren-ai-service:18] initializing provider: ollama_embedder
2025-02-09 15:17:35 I0209 09:47:35.844 8 wren-ai-service:93] Getting provider: ollama_embedder from {'openai_embedder': <class 'src.providers.embedder.openai.OpenAIEmbedderProvider'>, 'qdrant': <class 'src.providers.document_store.qdrant.QdrantProvider'>, 'azure_openai_embedder': <class 'src.providers.embedder.azure_openai.AzureOpenAIEmbedderProvider'>, 'ollama_embedder': <class 'src.providers.embedder.ollama.OllamaEmbedderProvider'>, 'wren_ui': <class 'src.providers.engine.wren.WrenUI'>, 'wren_ibis': <class 'src.providers.engine.wren.WrenIbis'>, 'wren_engine': <class 'src.providers.engine.wren.WrenEngine'>, 'azure_openai_llm': <class 'src.providers.llm.azure_openai.AzureOpenAILLMProvider'>, 'litellm_llm': <class 'src.providers.llm.litellm.LitellmLLMProvider'>, 'ollama_llm': <class 'src.providers.llm.ollama.OllamaLLMProvider'>, 'openai_llm': <class 'src.providers.llm.openai.OpenAILLMProvider'>}
2025-02-09 15:17:35 ERROR:    Traceback (most recent call last):
2025-02-09 15:17:35   File "/app/.venv/lib/python3.12/site-packages/httpx/_transports/default.py", line 72, in map_httpcore_exceptions
2025-02-09 15:17:35     yield
2025-02-09 15:17:35   File "/app/.venv/lib/python3.12/site-packages/httpx/_transports/default.py", line 236, in handle_request
2025-02-09 15:17:35     resp = self._pool.handle_request(req)
2025-02-09 15:17:35            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-02-09 15:17:35   File "/app/.venv/lib/python3.12/site-packages/httpcore/_sync/connection_pool.py", line 256, in handle_request
2025-02-09 15:17:35     raise exc from None
2025-02-09 15:17:35   File "/app/.venv/lib/python3.12/site-packages/httpcore/_sync/connection_pool.py", line 236, in handle_request
2025-02-09 15:17:35     response = connection.handle_request(
2025-02-09 15:17:35                ^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-02-09 15:17:35   File "/app/.venv/lib/python3.12/site-packages/httpcore/_sync/connection.py", line 101, in handle_request
2025-02-09 15:17:35     raise exc
2025-02-09 15:17:35   File "/app/.venv/lib/python3.12/site-packages/httpcore/_sync/connection.py", line 78, in handle_request
2025-02-09 15:17:35     stream = self._connect(request)
2025-02-09 15:17:35              ^^^^^^^^^^^^^^^^^^^^^^
2025-02-09 15:17:35   File "/app/.venv/lib/python3.12/site-packages/httpcore/_sync/connection.py", line 124, in _connect
2025-02-09 15:17:35     stream = self._network_backend.connect_tcp(**kwargs)
2025-02-09 15:17:35              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-02-09 15:17:35   File "/app/.venv/lib/python3.12/site-packages/httpcore/_backends/sync.py", line 207, in connect_tcp
2025-02-09 15:17:35     with map_exceptions(exc_map):
2025-02-09 15:17:35   File "/usr/local/lib/python3.12/contextlib.py", line 155, in __exit__
2025-02-09 15:17:35     self.gen.throw(value)
2025-02-09 15:17:35   File "/app/.venv/lib/python3.12/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
2025-02-09 15:17:35     raise to_exc(exc) from exc
2025-02-09 15:17:35 httpcore.ConnectError: [Errno 111] Connection refused
2025-02-09 15:17:35 
2025-02-09 15:17:35 The above exception was the direct cause of the following exception:
2025-02-09 15:17:35 
2025-02-09 15:17:35 Traceback (most recent call last):
2025-02-09 15:17:35   File "/app/.venv/lib/python3.12/site-packages/starlette/routing.py", line 693, in lifespan
2025-02-09 15:17:35     async with self.lifespan_context(app) as maybe_state:
2025-02-09 15:17:35   File "/usr/local/lib/python3.12/contextlib.py", line 204, in __aenter__
2025-02-09 15:17:35     return await anext(self.gen)
2025-02-09 15:17:35            ^^^^^^^^^^^^^^^^^^^^^
2025-02-09 15:17:35   File "/app/.venv/lib/python3.12/site-packages/fastapi/routing.py", line 133, in merged_lifespan
2025-02-09 15:17:35     async with original_context(app) as maybe_original_state:
2025-02-09 15:17:35   File "/usr/local/lib/python3.12/contextlib.py", line 204, in __aenter__
2025-02-09 15:17:35     return await anext(self.gen)
2025-02-09 15:17:35            ^^^^^^^^^^^^^^^^^^^^^
2025-02-09 15:17:35   File "/app/.venv/lib/python3.12/site-packages/fastapi/routing.py", line 133, in merged_lifespan
2025-02-09 15:17:35     async with original_context(app) as maybe_original_state:
2025-02-09 15:17:35   File "/usr/local/lib/python3.12/contextlib.py", line 204, in __aenter__
2025-02-09 15:17:35     return await anext(self.gen)
2025-02-09 15:17:35            ^^^^^^^^^^^^^^^^^^^^^
2025-02-09 15:17:35   File "/src/__main__.py", line 29, in lifespan
2025-02-09 15:17:35     pipe_components = generate_components(settings.components)
2025-02-09 15:17:35                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-02-09 15:17:35   File "/src/providers/__init__.py", line 395, in generate_components
2025-02-09 15:17:35     identifier: provider_factory(config)
2025-02-09 15:17:35                 ^^^^^^^^^^^^^^^^^^^^^^^^
2025-02-09 15:17:35   File "/src/providers/__init__.py", line 19, in provider_factory
2025-02-09 15:17:35     return loader.get_provider(config.get("provider"))(**config)
2025-02-09 15:17:35            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-02-09 15:17:35   File "/src/providers/embedder/ollama.py", line 178, in __init__
2025-02-09 15:17:35     pull_ollama_model(self._url, self._embedding_model)
2025-02-09 15:17:35   File "/src/providers/loader.py", line 107, in pull_ollama_model
2025-02-09 15:17:35     models = [model["name"] for model in client.list()["models"]]
2025-02-09 15:17:35                                          ^^^^^^^^^^^^^
2025-02-09 15:17:35   File "/app/.venv/lib/python3.12/site-packages/ollama/_client.py", line 333, in list
2025-02-09 15:17:35     return self._request('GET', '/api/tags').json()
2025-02-09 15:17:35            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-02-09 15:17:35   File "/app/.venv/lib/python3.12/site-packages/ollama/_client.py", line 69, in _request
2025-02-09 15:17:35     response = self._client.request(method, url, **kwargs)
2025-02-09 15:17:35                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-02-09 15:17:35   File "/app/.venv/lib/python3.12/site-packages/httpx/_client.py", line 837, in request
2025-02-09 15:17:35     return self.send(request, auth=auth, follow_redirects=follow_redirects)
2025-02-09 15:17:35            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-02-09 15:17:35   File "/app/.venv/lib/python3.12/site-packages/httpx/_client.py", line 926, in send
2025-02-09 15:17:35     response = self._send_handling_auth(
2025-02-09 15:17:35                ^^^^^^^^^^^^^^^^^^^^^^^^^
2025-02-09 15:17:35   File "/app/.venv/lib/python3.12/site-packages/httpx/_client.py", line 954, in _send_handling_auth
2025-02-09 15:17:35     response = self._send_handling_redirects(
2025-02-09 15:17:35                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-02-09 15:17:35   File "/app/.venv/lib/python3.12/site-packages/httpx/_client.py", line 991, in _send_handling_redirects
2025-02-09 15:17:35     response = self._send_single_request(request)
2025-02-09 15:17:35                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-02-09 15:17:35   File "/app/.venv/lib/python3.12/site-packages/httpx/_client.py", line 1027, in _send_single_request
2025-02-09 15:17:35     response = transport.handle_request(request)
2025-02-09 15:17:35                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-02-09 15:17:35   File "/app/.venv/lib/python3.12/site-packages/httpx/_transports/default.py", line 235, in handle_request
2025-02-09 15:17:35     with map_httpcore_exceptions():
2025-02-09 15:17:35   File "/usr/local/lib/python3.12/contextlib.py", line 155, in __exit__
2025-02-09 15:17:35     self.gen.throw(value)
2025-02-09 15:17:35   File "/app/.venv/lib/python3.12/site-packages/httpx/_transports/default.py", line 89, in map_httpcore_exceptions
2025-02-09 15:17:35     raise mapped_exc(message) from exc
2025-02-09 15:17:35 httpx.ConnectError: [Errno 111] Connection refused
2025-02-09 15:17:35 
2025-02-09 15:17:35 ERROR:    Application startup failed. Exiting.
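The last frames point at the ollama_embedder provider: during initialization, pull_ollama_model calls client.list(), i.e. GET /api/tags against the configured url (http://localhost:11434), and the TCP connect is refused. A quick way to check whether an Ollama server is actually listening there (run on the host):

curl http://localhost:11434/api/tags   # lists the locally pulled Ollama models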

prathameshbelurkar avatar Feb 09 '25 09:02 prathameshbelurkar

Are there any missing configurations in the .env or config.yaml files? Please help!

prathameshbelurkar avatar Feb 09 '25 09:02 prathameshbelurkar

> Are there any missing configurations in the .env or config.yaml files? Please help!

Please check this for reference: https://github.com/Canner/WrenAI/blob/main/wren-ai-service/docs/config_examples/config.ollama.yaml
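Two things stand out in your config. First, the embedder url http://localhost:11434 is resolved from inside the ai-service container, where localhost is the container itself, not your host, which is why the GET /api/tags in the traceback is refused. Second, the pipeline section still references openai_embedder.text-embedding-3-large and litellm_llm.gpt-4o-mini-2024-07-18 instead of the providers you actually defined. A minimal sketch of the relevant sections, based on the linked example and the models you already declared (host.docker.internal works on Docker Desktop; on native Linux you may need an extra_hosts: "host.docker.internal:host-gateway" entry in the compose file):

type: embedder
provider: ollama_embedder
models:
  - model: nomic-embed-text
    dimension: 768
url: http://host.docker.internal:11434  # reachable from inside the container
timeout: 120

---
type: pipeline
pipes:
  - name: db_schema_indexing
    embedder: ollama_embedder.nomic-embed-text  # provider.model, matching the definitions above
    document_store: qdrant
  - name: sql_generation
    llm: litellm_llm.lm_studio/mlx-community/meta-llama-3.2-1b-instruct
    engine: wren_ui
  # ...update the remaining pipes the same way

Your document_store's embedding_model_dim: 768 already matches nomic-embed-text. For the same container-networking reason, the engine endpoints should use the compose service names (e.g. http://wren-ui:3000, as your .env's WREN_UI_ENDPOINT already does) rather than localhost.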

cyyeh avatar Feb 09 '25 09:02 cyyeh

@cyyeh Thanks for the help! I'll refer to this. I'm not working on this issue at the moment, so I'll close it for now.

prathameshbelurkar avatar Mar 08 '25 08:03 prathameshbelurkar