
[BUG] Usage issue with the GraphRAG feature

Open yuqiao9 opened this issue 1 year ago • 5 comments

Description

After I upload a file and run the GraphRAG collection, I get the errors shown in the attached screenshots. Am I missing any key steps? I have set the API key, set USE_CUSTOMIZED_GRAPHRAG_SETTING=true, and mounted settings.yaml.example.
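
In .env terms, that setup looks roughly like this (a sketch only; the exact file location depends on how kotaemon is started, and the API key itself lives in the mounted settings.yaml in my case):

```
# Sketch of the environment used when starting kotaemon
# (exact file location depends on your setup; key values redacted)
USE_CUSTOMIZED_GRAPHRAG_SETTING=true
GRAPHRAG_API_KEY=<your key, if using the OpenAI backend>
```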

(Screenshots of the error logs attached.)

Reproduction steps

1. Go to '...'
2. Click on '....'
3. Scroll down to '....'
4. See error

Screenshots

No response

Logs

No response

Browsers

No response

OS

No response

Additional information

No response

yuqiao9 · Oct 29, 2024

I noticed there was an error when uploading the file. Could you please tell me what might be causing this? (Screenshot attached.)

yuqiao9 · Oct 29, 2024

(Additional screenshots attached.)

yuqiao9 · Oct 29, 2024

Could you double-check that:

  • You are using an OpenAI LLM and have the GRAPHRAG_API_KEY env var set.
  • If you are using a custom model through settings.yaml.example, the URL and settings in there are correct.

If you use GraphRAG with the Kotaemon Docker version, you may need to change the host name so the container can reach Ollama or other services running on the host. See https://stackoverflow.com/questions/31324981/how-to-access-host-port-from-docker-container
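
For example, if Ollama runs on the Docker host, the llm section could point at it roughly like this (a sketch only; host.docker.internal works out of the box on Docker Desktop, while on Linux you may need to start the container with --add-host=host.docker.internal:host-gateway or use the host's LAN IP):

```yaml
# Sketch: reaching an Ollama server on the Docker host from inside the container
llm:
  api_key: ollama          # Ollama does not check the key, but the client expects one
  type: openai_chat
  api_base: http://host.docker.internal:11434/v1   # rather than localhost/127.0.0.1
  model: llama3.1
```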

taprosoft · Oct 30, 2024

I believe my configuration is correct. Can you help me identify what might be wrong? Here is my settings.yaml:

```yaml
encoding_model: cl100k_base
skip_workflows: []
llm:
  api_key: ollama
  type: openai_chat # or azure_openai_chat
  api_base: http://192.168.8.101:11434/v1
  model: llama3.1
  model_supports_json: true # recommended if this is available for your model.
  # max_tokens: 4000
  # request_timeout: 1800.0
  # api_base: https://<instance>.openai.azure.com
  # api_version: 2024-02-15-preview
  # organization: <organization_id>
  # deployment_name: <azure_model_deployment_name>
  # tokens_per_minute: 150_000 # set a leaky bucket throttle
  # requests_per_minute: 10_000 # set a leaky bucket throttle
  # max_retries: 10
  # max_retry_wait: 10.0
  # sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times
  # concurrent_requests: 5 # the number of parallel inflight requests that may be made
  # temperature: 0 # temperature for sampling
  # top_p: 1 # top-p sampling
  # n: 1 # Number of completions to generate

parallelization:
  stagger: 0.3
  # num_threads: 50 # the number of threads to use for parallel processing

async_mode: threaded # or asyncio

embeddings:
  ## parallelization: override the global parallelization settings for embeddings
  async_mode: threaded # or asyncio
  # target: required # or all
  # batch_size: 16 # the number of documents to send in a single request
  # batch_max_tokens: 8191 # the maximum number of tokens to send in a single request
  llm:
    api_base: http://192.168.8.101:11434/v1
    api_key: ollama
    model: nomic-embed-text
    type: openai_embedding
```
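
One way to rule out connectivity problems between GraphRAG and Ollama is to call the same endpoints this settings.yaml points at directly. A minimal sketch using the plain OpenAI-compatible REST API (not part of kotaemon or GraphRAG; run it from wherever GraphRAG actually executes, e.g. inside the container):

```python
# Minimal sketch: check that the Ollama OpenAI-compatible endpoints referenced in
# settings.yaml are reachable and that both models respond.
import requests

BASE = "http://192.168.8.101:11434/v1"  # same api_base as in settings.yaml

# Chat completion with the indexing model
chat = requests.post(
    f"{BASE}/chat/completions",
    json={"model": "llama3.1",
          "messages": [{"role": "user", "content": "ping"}]},
    timeout=60,
)
chat.raise_for_status()
print("chat ok:", chat.json()["choices"][0]["message"]["content"][:80])

# Embedding with the embedding model
emb = requests.post(
    f"{BASE}/embeddings",
    json={"model": "nomic-embed-text", "input": ["ping"]},
    timeout=60,
)
emb.raise_for_status()
print("embedding dims:", len(emb.json()["data"][0]["embedding"]))
```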

yuqiao9 · Nov 05, 2024

(Screenshot attached.)

yuqiao9 · Nov 05, 2024