
Guidance for OPs that process multiple data fields

Open yxdyc opened this issue 1 year ago • 2 comments

Search before continuing

  • [X] I have searched the Data-Juicer issues and found no similar feature requests.

Description

Currently, users may be unsure how a single OP can support multiple fields, for example how to develop an OP that processes both text_key="question" and text_key="answer".
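One way to picture the multi-field case is an OP that is configured with a list of keys instead of a single `text_key`. The sketch below is only an illustration: `MultiFieldWhitespaceMapper` and its `text_keys` parameter are hypothetical names and not part of the Data-Juicer API.

```python
from typing import Dict, List


class MultiFieldWhitespaceMapper:
    """Hypothetical mapper that normalizes whitespace in several text fields."""

    def __init__(self, text_keys: List[str]):
        # Assumption: the OP takes a list of keys rather than one text_key.
        self.text_keys = list(text_keys)

    def process(self, sample: Dict[str, str]) -> Dict[str, str]:
        # Apply the same transformation to every configured field.
        for key in self.text_keys:
            sample[key] = " ".join(sample[key].split())
        return sample


op = MultiFieldWhitespaceMapper(text_keys=["question", "answer"])
result = op.process({"question": "What  is\tDJ?", "answer": " A data   toolkit. "})
```

Looping over the configured keys keeps the per-field logic identical, so the same OP could serve a single field or several without a separate code path.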

In addition, we need to add guidance on the type of text-related keys: each value must be a str rather than a list or dict, for the sake of efficiency and coding convenience. This is an implicit assumption of all text-related OPs.
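The str-only assumption could be made explicit with a small check that runs before an OP touches a sample. This is a minimal sketch under that assumption; `check_text_fields` is a hypothetical helper, not an existing Data-Juicer function.

```python
from typing import Any, Dict, Iterable


def check_text_fields(sample: Dict[str, Any], text_keys: Iterable[str]) -> None:
    """Raise TypeError if any configured text field is not a plain str."""
    for key in text_keys:
        value = sample[key]
        if not isinstance(value, str):
            raise TypeError(
                f"field {key!r} must be str, got {type(value).__name__}"
            )


# A sample whose text fields are all str passes silently.
check_text_fields({"question": "Q", "answer": "A"}, ["question", "answer"])

# A list-valued field is rejected with a descriptive error.
try:
    check_text_fields({"question": "Q", "answer": ["A1", "A2"]}, ["question", "answer"])
except TypeError as err:
    message = str(err)
```

Failing early with the offending key in the message makes it easier for users to see that a list or dict field needs to be flattened before a text OP is applied.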

Use case

Related issue: https://github.com/modelscope/data-juicer/issues/380

Additional

No response

Are you willing to submit a PR for this feature?

  • [X] Yes, I'd like to help by submitting a PR!

yxdyc • Sep 02 '24 11:09

This issue is marked as stale because there has been no activity for 21 days. Remove the stale label or add a new comment, or this issue will be closed in 3 days.

github-actions[bot] • Sep 24 '24 09:09

Close this stale issue.

github-actions[bot] • Sep 28 '24 09:09