Batch Execution Management and Knowledge-Base-Writing Node for Efficient Dataset Synthesis

Open hagemon opened this issue 6 months ago • 0 comments

Self Checks

  • [X] I have searched for existing issues, including closed ones.
  • [X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [X] [FOR CHINESE USERS] Please be sure to submit the issue in English, otherwise it will be closed. Thank you! :)
  • [X] Please do not modify this template :) and fill in all the required fields.

1. Is this request related to a challenge you're experiencing? Tell me about your story.

In my daily work, I often need to use larger models to synthesize a large amount of data to enrich my dataset. This dataset is then used for fine-tuning smaller models or testing RAG capabilities.

Currently, Dify provides a batch execution feature, but there seems to be no management page where I can stop batch tasks at any time, view historical tasks, and manage the generated results. I currently handle all of this with scripts, but I'm looking for a more streamlined, end-to-end experience.

2. Additional context or comments

I propose three main features:

  1. Batch Execution Management: An entry on the workflow orchestration page that leads to a batch management page. There, users can watch batch tasks running in the background, with real-time updates on a dashboard. Users can view and stop tasks at any time, retrieve the results produced so far, and browse the history of executed tasks.

  2. Knowledge-Base-Writing Node: A node that accepts vectors produced by an embedding model as input. Users specify the knowledge base to write to, and after execution the vectors and their original text are added to that knowledge base.

  3. The Combination of Both Functions: Each batch task generates a single document in the knowledge base, and each generated text (along with its corresponding vector) becomes one paragraph in that document.
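
To make the proposal more concrete, here is a rough Python sketch of how the three pieces could fit together. It is only an illustration under my own assumptions and not Dify's actual API: `BatchManager`, `BatchTask`, `KnowledgeBase`, and the `generate` / `embed` callables are all hypothetical names, the knowledge base is a plain in-memory dictionary, and tasks run synchronously rather than on a background worker.

```python
import threading
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence


@dataclass
class Paragraph:
    text: str
    vector: List[float]


@dataclass
class KnowledgeBase:
    """Hypothetical in-memory store: document name -> list of paragraphs."""
    documents: Dict[str, List[Paragraph]] = field(default_factory=dict)

    def append(self, document: str, text: str, vector: Sequence[float]) -> None:
        # Each call adds one paragraph (text plus vector) to the named document.
        self.documents.setdefault(document, []).append(Paragraph(text, list(vector)))


@dataclass
class BatchTask:
    task_id: str
    inputs: List[str]
    status: str = "pending"  # pending -> running -> completed | stopped
    results: List[str] = field(default_factory=list)
    stop_event: threading.Event = field(default_factory=threading.Event)


class BatchManager:
    """Hypothetical manager that tracks batch tasks and writes their output to a knowledge base."""

    def __init__(
        self,
        kb: KnowledgeBase,
        generate: Callable[[str], str],      # stand-in for the LLM call
        embed: Callable[[str], List[float]], # stand-in for the embedding model
    ) -> None:
        self.kb = kb
        self.generate = generate
        self.embed = embed
        self.history: Dict[str, BatchTask] = {}  # every submitted task, by id

    def submit(self, task_id: str, inputs: Sequence[str]) -> BatchTask:
        task = BatchTask(task_id, list(inputs))
        self.history[task_id] = task
        return task

    def stop(self, task_id: str) -> None:
        # Request a stop; the run loop checks the flag before each input.
        self.history[task_id].stop_event.set()

    def run(self, task_id: str) -> BatchTask:
        task = self.history[task_id]
        task.status = "running"
        for item in task.inputs:
            if task.stop_event.is_set():
                task.status = "stopped"
                return task
            text = self.generate(item)
            # One batch task maps to one document; each generated text is a paragraph.
            self.kb.append(task.task_id, text, self.embed(text))
            task.results.append(text)
        task.status = "completed"
        return task
```

In this sketch, `submit` registers a task, `run` processes its inputs one by one, `stop` can be called at any point to halt before the next input, and `history` keeps every task together with the results it has produced so far. A real implementation would presumably run tasks on a background worker and persist the history instead of holding it in memory.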

I would like to contribute code, provide feedback, and assist with testing to ensure the feature is implemented effectively and meets user needs.

3. Can you help us with this feature?

  • [X] I am interested in contributing to this feature.

hagemon · Aug 05 '24 13:08