[Question]: Parallel document processing
Do you need to ask a question?
- [x] I have searched the existing question and discussions and this question is not already answered.
- [x] I believe this is a legitimate question, not just a bug or feature request.
Your Question
For document upload and knowledge graph construction, the configuration file suggests processing 2-10 files in parallel. What is the basis for this recommendation? If I run several instances, should this 2-10 parallel-file setting be increased accordingly?
Additional Context
No response
"There are two key parameters that control pipeline concurrency: the maximum number of files processed in parallel (MAX_PARALLEL_INSERT) and the maximum number of concurrent LLM requests (MAX_ASYNC). The workflow is described as follows:
MAX_ASYNC limits the total number of concurrent LLM requests in the system, including those for querying, extraction, and merging. LLM requests have different priorities: query operations have the highest priority, followed by merging, and then extraction. MAX_PARALLEL_INSERT controls the number of files processed in parallel during the extraction stage. For optimal performance, MAX_PARALLEL_INSERT is recommended to be set between 2 and 10, typically MAX_ASYNC/3. Setting this value too high can increase the likelihood of naming conflicts among entities and relationships across different documents during the merge phase, thereby reducing its overall efficiency. Within a single file, entity and relationship extractions from different text blocks are processed concurrently, with the degree of concurrency set by MAX_ASYNC. Only after MAX_ASYNC text blocks are processed will the system proceed to the next batch within the same file. When a file completes entity and relationship extraction, it enters the entity and relationship merging stage. This stage also processes multiple entities and relationships concurrently, with the concurrency level also controlled by MAX_ASYNC."
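For illustration only, here is a minimal Python sketch of the "roughly MAX_ASYNC/3, kept within the 2-10 range" guideline above; the helper name is hypothetical and not part of the LightRAG API:

```python
# Hypothetical helper illustrating the recommendation above; not a LightRAG API.
import os

def suggested_parallel_insert(max_async: int) -> int:
    """Derive a file-level parallelism value from the LLM concurrency limit."""
    return max(2, min(10, max_async // 3))

max_async = int(os.getenv("MAX_ASYNC", "4"))   # default value from the quoted docs
print(suggested_parallel_insert(max_async))    # -> 2 with the default MAX_ASYNC=4
print(suggested_parallel_insert(24))           # -> 8 for a high-concurrency backend
```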
"The bottleneck of document indexing speed mainly lies with the LLM. If your LLM supports high concurrency, you can accelerate document indexing by increasing the concurrency level of the LLM. Below are several environment variables related to concurrent processing, along with their default values:"
- WORKERS=2: number of worker processes, not greater than (2 x number_of_cores) + 1
- MAX_PARALLEL_INSERT=2: number of parallel files to process in one batch
- MAX_ASYNC=4: max concurrent requests to the LLM
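If it helps, a short script like the one below can read the same environment variables and check them against the constraints listed above; this check is purely illustrative and not part of LightRAG:

```python
# Illustrative sanity check of the configuration constraints quoted above.
import os

cores = os.cpu_count() or 1
workers = int(os.getenv("WORKERS", "2"))
max_parallel_insert = int(os.getenv("MAX_PARALLEL_INSERT", "2"))
max_async = int(os.getenv("MAX_ASYNC", "4"))

assert workers <= 2 * cores + 1, "WORKERS should stay at or below (2 x cores) + 1"
assert 2 <= max_parallel_insert <= 10, "keep MAX_PARALLEL_INSERT within the 2-10 range"
print(f"WORKERS={workers}, MAX_PARALLEL_INSERT={max_parallel_insert}, MAX_ASYNC={max_async}")
```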
The performance bottleneck in document processing lies in the LLM's computational capacity and concurrency limits. A single LightRAG instance is usually enough to saturate the performance of most LLM deployments. However, if LightRAG's concurrency level significantly exceeds what the LLM can actually handle concurrently, the system can become congested, which degrades performance rather than improving it.
https://github.com/HKUDS/LightRAG/blob/main/docs/LightRAG_concurrent_explain.md
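To make the congestion point concrete, here is a minimal conceptual sketch of MAX_ASYNC-style throttling using an asyncio semaphore. It only mirrors the idea of capping in-flight LLM requests; it is not LightRAG's actual implementation:

```python
# Conceptual sketch: cap concurrent "LLM requests" with a semaphore, as MAX_ASYNC does.
import asyncio

MAX_ASYNC = 4  # default value from the settings quoted above

async def call_llm(prompt: str, llm_slots: asyncio.Semaphore) -> str:
    async with llm_slots:          # at most MAX_ASYNC requests are in flight at once
        await asyncio.sleep(0.1)   # stand-in for a real LLM request
        return f"response to {prompt}"

async def main() -> None:
    llm_slots = asyncio.Semaphore(MAX_ASYNC)
    # Twenty chunk-extraction prompts are queued, but only MAX_ASYNC run at any moment.
    results = await asyncio.gather(*(call_llm(f"chunk {i}", llm_slots) for i in range(20)))
    print(len(results), "responses")

asyncio.run(main())
```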
OK, thank you for the reply.
Understood, thanks for the explanation.