Deployment succeeded, but responses are very slow. What is the problem, and how can it be improved?

Open starise-wg opened this issue 4 months ago • 5 comments

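1. Environment: deployed on Linux, RTX 2080Ti (22GB) × 1, connected to Ollama. The LLM is llama3:8b and the embedding model is nomic-embed-text:latest.

2. Problem: I imported only one table and then asked a question in the input box, but the response is very slow. What is the problem, and how can I improve it?

Image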

starise-wg avatar Sep 03 '25 03:09 starise-wg
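
One way to narrow down where the time goes (a suggestion, not something confirmed in this thread) is to time a single request against Ollama directly, bypassing WrenAI. The sketch below assumes Ollama's `/api/generate` endpoint with `stream: false` and the timing fields it reports in nanoseconds (`total_duration`, `load_duration`, `eval_count`, `eval_duration`). The helper names `summarize_timing` and `time_ollama` are hypothetical, and the host and prompt are placeholders.

```python
import json
import urllib.request

NS_PER_S = 1_000_000_000  # Ollama reports durations in nanoseconds


def summarize_timing(resp: dict) -> dict:
    """Convert Ollama's nanosecond timing fields into seconds and tokens/s."""
    eval_s = resp.get("eval_duration", 0) / NS_PER_S
    tokens = resp.get("eval_count", 0)
    return {
        "total_s": resp.get("total_duration", 0) / NS_PER_S,
        "load_s": resp.get("load_duration", 0) / NS_PER_S,
        "generated_tokens": tokens,
        "tokens_per_s": tokens / eval_s if eval_s > 0 else 0.0,
    }


def time_ollama(api_base: str, model: str, prompt: str, timeout: float = 600) -> dict:
    """Send one non-streaming generate request and return its timing summary."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        f"{api_base}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as r:
        return summarize_timing(json.load(r))
```

For example, `time_ollama("http://localhost:11434", "llama3:8b", "Say hi")` would return the summary for one request, with the host replaced by the `api_base` from your config.yaml. A large `load_s` would suggest the model is being loaded on each request, while a low `tokens_per_s` would point at the model or GPU rather than at WrenAI's pipeline.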

Hi, my machine's setup is similar to yours. Could you share how your config.yaml is configured?

icefragrant avatar Sep 06 '25 05:09 icefragrant

Hi, my machine's setup is similar to yours. Could you share how your config.yaml is configured?

type: llm
provider: litellm_llm
timeout: 1800
models:
  - api_base: http://125.77.155.56:11434
    model: ollama_chat/llama3:8b
    alias: default
    timeout: 1800
    kwargs:
      num_ctx: 2048
      max_tokens: 4096
      n: 1
      temperature: 0
  - model: gpt-4.1-mini-2025-04-14
    context_window_size: 1000000
    kwargs:
      max_tokens: 4096
      n: 1
      seed: 0
      temperature: 0
  - model: gpt-4.1-2025-04-14
    context_window_size: 1000000
    kwargs:
      max_tokens: 4096
      n: 1
      seed: 0
      temperature: 0
  - model: gpt-5-nano-2025-08-07
    context_window_size: 380000
    kwargs:
      max_completion_tokens: 4096
      n: 1
      seed: 0
      reasoning_effort: minimal
  - model: gpt-5-mini-2025-08-07
    context_window_size: 380000
    kwargs:
      max_completion_tokens: 4096
      n: 1
      seed: 0
      reasoning_effort: minimal
  - model: gpt-5-2025-08-07
    context_window_size: 380000
    kwargs:
      max_completion_tokens: 4096
      n: 1
      seed: 0
      reasoning_effort: minimal
---
type: embedder
provider: litellm_embedder
models:
  - model: ollama/nomic-embed-text:latest
    alias: default
    api_base: http://125.77.155.56:11434
    timeout: 600
---
type: engine
provider: wren_ui
endpoint: http://wren-ui:3000
---
type: engine
provider: wren_ibis
endpoint: http://ibis-server:8000
---
type: document_store
provider: qdrant
location: http://qdrant:6333
embedding_model_dim: 768
timeout: 120
recreate_index: true
---
type: pipeline
pipes:
  - name: db_schema_indexing
    embedder: litellm_embedder.default
    document_store: qdrant
  - name: historical_question_indexing
    embedder: litellm_embedder.default
    document_store: qdrant
  - name: table_description_indexing
    embedder: litellm_embedder.default
    document_store: qdrant
  - name: db_schema_retrieval
    llm: litellm_llm.default
    embedder: litellm_embedder.default
    document_store: qdrant
  - name: historical_question_retrieval
    embedder: litellm_embedder.default
    document_store: qdrant
  - name: sql_generation
    llm: litellm_llm.default
    engine: wren_ui
    document_store: qdrant
  - name: sql_correction
    llm: litellm_llm.default
    engine: wren_ui
    document_store: qdrant
  - name: followup_sql_generation
    llm: litellm_llm.default
    engine: wren_ui
    document_store: qdrant
  - name: sql_answer
    llm: litellm_llm.default
  - name: semantics_description
    llm: litellm_llm.default
  - name: relationship_recommendation
    llm: litellm_llm.default
  - name: question_recommendation
    llm: litellm_llm.default
  - name: question_recommendation_sql_generation
    llm: litellm_llm.default
    engine: wren_ui
    document_store: qdrant
  - name: intent_classification
    llm: litellm_llm.default
    embedder: litellm_embedder.default
    document_store: qdrant
  - name: misleading_assistance
    llm: litellm_llm.default
  - name: data_assistance
    llm: litellm_llm.default
  - name: sql_pairs_indexing
    document_store: qdrant
    embedder: litellm_embedder.default
  - name: sql_pairs_retrieval
    document_store: qdrant
    embedder: litellm_embedder.default
    llm: litellm_llm.default
  - name: preprocess_sql_data
    llm: litellm_llm.default
  - name: sql_executor
    engine: wren_ui
  - name: chart_generation
    llm: litellm_llm.default
  - name: chart_adjustment
    llm: litellm_llm.default
  - name: user_guide_assistance
    llm: litellm_llm.default
  - name: sql_question_generation
    llm: litellm_llm.default
  - name: sql_generation_reasoning
    llm: litellm_llm.default
  - name: followup_sql_generation_reasoning
    llm: litellm_llm.default
  - name: sql_regeneration
    llm: litellm_llm.default
    engine: wren_ui
  - name: instructions_indexing
    embedder: litellm_embedder.default
    document_store: qdrant
  - name: instructions_retrieval
    embedder: litellm_embedder.default
    document_store: qdrant
  - name: sql_functions_retrieval
    engine: wren_ibis
    document_store: qdrant
  - name: project_meta_indexing
    document_store: qdrant
  - name: sql_tables_extraction
    llm: litellm_llm.default
  - name: question_recommendation_db_schema_retrieval
    llm: litellm_llm.default
    embedder: litellm_embedder.default
    document_store: qdrant
---
settings:
  doc_endpoint: https://docs.getwren.ai
  is_oss: true
  engine_timeout: 30
  column_indexing_batch_size: 50
  table_retrieval_size: 10
  table_column_retrieval_size: 100
  allow_intent_classification: true
  allow_sql_generation_reasoning: true
  allow_sql_functions_retrieval: true
  enable_column_pruning: false
  max_sql_correction_retries: 3
  query_cache_maxsize: 1000
  query_cache_ttl: 3600
  langfuse_host: https://cloud.langfuse.com
  langfuse_enable: true
  logging_level: DEBUG
  development: false
  historical_question_retrieval_similarity_threshold: 0.9
  sql_pairs_similarity_threshold: 0.7
  sql_pairs_retrieval_max_size: 10
  instructions_similarity_threshold: 0.7
  instructions_top_k: 10

starise-wg avatar Sep 06 '25 06:09 starise-wg
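
As a reading aid for the config above: 23 of its 33 pipes name an `llm`, and every one of them uses `litellm_llm.default`, which presumably resolves to the `alias: default` entry, `ollama_chat/llama3:8b`. Any LLM-backed step therefore goes to the same Ollama model. The sketch below tallies this for a hand-copied subset of pipes, keeping only the keys relevant here. The `PIPES` list and the `group_by_llm` helper are illustrative and not part of WrenAI.

```python
from collections import defaultdict

# Hand-copied subset of the `pipes` list from the config (illustrative only;
# `document_store` keys are omitted because they do not matter for this tally).
PIPES = [
    {"name": "db_schema_indexing", "embedder": "litellm_embedder.default"},
    {"name": "intent_classification", "llm": "litellm_llm.default",
     "embedder": "litellm_embedder.default"},
    {"name": "sql_generation_reasoning", "llm": "litellm_llm.default"},
    {"name": "sql_generation", "llm": "litellm_llm.default", "engine": "wren_ui"},
    {"name": "sql_correction", "llm": "litellm_llm.default", "engine": "wren_ui"},
    {"name": "sql_executor", "engine": "wren_ui"},
]


def group_by_llm(pipes):
    """Map each LLM reference to the names of the pipes that use it."""
    groups = defaultdict(list)
    for pipe in pipes:
        if "llm" in pipe:  # pipes without an `llm` key make no LLM calls
            groups[pipe["llm"]].append(pipe["name"])
    return dict(groups)


print(group_by_llm(PIPES))
```

If several of these pipes run for a single question, their latencies would add up on one GPU, so it may be worth checking how long each LLM call takes before changing the configuration.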

Thanks!

icefragrant avatar Sep 06 '25 07:09 icefragrant

Thanks!

How is the response time on your end?

starise-wg avatar Sep 06 '25 07:09 starise-wg

On my side, after it starts up I can't start a conversation at all.

Image Image

My configuration is about the same as yours.

jenchih avatar Oct 29 '25 09:10 jenchih