process_rule mode does not contemplate the hierarchical option

Open pr-maia opened this issue 1 year ago • 2 comments

Self Checks

  • [X] I have searched for existing issues, including closed ones.
  • [X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [X] [FOR CHINESE USERS] Please be sure to submit issues in English, otherwise they will be closed. Thank you! :)
  • [X] Please do not modify this template :) and fill in all the required fields.

Provide a description of requested docs changes

The documentation states that the mode parameter of process_rules accepts only two values (automatic or custom). In fact, to use parent-child chunking, mode must also be set to hierarchical.

Even when all the other parameters are set correctly, for example doc_form set to hierarchical_model and parent_mode set to paragraph, the document is still processed as general rather than as parent-child.
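
For reference, a minimal sketch of the request body this report has in mind. Only mode, doc_form, parent_mode and their values come from this issue; the other field names (name, text, rules, pre_processing_rules, segmentation, subchunk_segmentation, separator, max_tokens), the nesting of parent_mode under rules, and all numeric values are assumptions for illustration and should be checked against the API reference.

```python
import json


def build_hierarchical_process_rule(parent_mode="paragraph"):
    """Build a process_rule dict that requests parent-child chunking.

    Everything except "mode" and "parent_mode" is an assumed field name.
    """
    return {
        # The value the docs omitted: not "automatic" and not "custom".
        "mode": "hierarchical",
        "rules": {
            # Assumed shape of the pre-processing rules.
            "pre_processing_rules": [
                {"id": "remove_extra_spaces", "enabled": True},
            ],
            # Assumed parent-chunk segmentation settings.
            "segmentation": {"separator": "\n\n", "max_tokens": 1024},
            "parent_mode": parent_mode,
            # Assumed child-chunk segmentation settings.
            "subchunk_segmentation": {"separator": "\n", "max_tokens": 512},
        },
    }


def build_document_payload(name, text):
    """Assemble a hypothetical document-creation body (no request is sent)."""
    return {
        "name": name,  # hypothetical field
        "text": text,  # hypothetical field
        "doc_form": "hierarchical_model",
        "process_rule": build_hierarchical_process_rule(),
    }


if __name__ == "__main__":
    print(json.dumps(build_document_payload("example", "Some text."), indent=2))
```

The point of the sketch is that all three settings have to agree: doc_form, parent_mode, and mode.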

pr-maia, Jan 08 '25 14:01

The process_rules function in the Dify codebase does indeed support a hierarchical mode. This mode is specifically configured within the getProcessRule function. When the hierarchical mode is selected, the function sets up a ProcessRule object that includes pre-processing rules, segmentation for parent chunks, and subchunk segmentation for child chunks. The configuration uses delimiters and maximum token lengths specified in parentChildConfig for both parent and child segments, and the parent_mode is determined by parentChildConfig.chunkForContext [1].

If you're experiencing issues with the hierarchical mode not functioning as expected, it might be worth checking the configuration of parentChildConfig and ensuring that all related parameters are correctly set. This includes verifying that doc_form is set to hierarchical_model and parent_mode is set to the desired value, such as paragraph. If these configurations are correct and the issue persists, it might be necessary to review the implementation details or any related modules that interact with process_rules.
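
The check described above could be sketched as a small helper that inspects a request body before it is sent. This is an illustrative sketch, not part of Dify: the function name is hypothetical, and the assumption that parent_mode sits under process_rule.rules should be verified against the API reference.

```python
def find_parent_child_mismatches(payload):
    """Return a list of problems; an empty list means the settings agree."""
    problems = []
    rule = payload.get("process_rule", {})
    rules = rule.get("rules", {})  # assumed location of parent_mode

    if payload.get("doc_form") != "hierarchical_model":
        problems.append("doc_form is not hierarchical_model")
    if rule.get("mode") != "hierarchical":
        problems.append("process_rule mode is not hierarchical")
    if not rules.get("parent_mode"):
        problems.append("parent_mode is not set")
    return problems


if __name__ == "__main__":
    # The situation from this issue: everything set except mode.
    body = {
        "doc_form": "hierarchical_model",
        "process_rule": {"mode": "custom", "rules": {"parent_mode": "paragraph"}},
    }
    for problem in find_parent_child_mismatches(body):
        print(problem)
```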


dosubot[bot], Jan 08 '25 14:01

We have a fix here: https://github.com/langgenius/dify/pull/12495. We will update the docs soon.

crazywoola, Jan 08 '25 14:01

Hi, @pr-maia. I'm Dosu, and I'm helping the Dify team manage their backlog. I'm marking this issue as stale.

Issue Summary:

  • You raised a concern about the missing "hierarchical" option in the process_rules function documentation.
  • I confirmed the support for the hierarchical mode and provided guidance.
  • @crazywoola mentioned a fix in a pull request to update the documentation.
  • You acknowledged the resolution with a thumbs-up reaction.

Next Steps:

  • Please confirm if this issue is still relevant to the latest version of the Dify repository. If so, feel free to comment to keep the discussion open.
  • If there are no further updates, this issue will be automatically closed in 15 days.

Thank you for your understanding and contribution!

dosubot[bot], Feb 08 '25 16:02