distilabel
[FEATURE] Add quick annotation guidelines
Is your feature request related to a problem? Please describe. In the generated dataset, we ask annotators to rate following the annotation guidelines, but the guidelines are empty.
Describe the solution you'd like We should include a very brief annotation guideline when setting up the dataset. Maybe reuse the general parts of the UF prompt?
Maybe adding default annotation guidelines can be complex if we want this to be extensible, so the easiest may be ~to just add the task definition within the guidelines i.e. "rate the quality of the responses assuming that those were generated using the following prompt template" or something similar;~ but I think we can also add "Rate ... given instruction based on the annotation guidelines, if any." (i.e. adding "if any" at the end).
Edit: forget what I just said, we have no traceability on which task was used, could be done in a hacky way but IMO not worth it; it would be better to just expose that within the init so that the person can set it instead.
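To make the "expose that within the init" idea concrete, here is a minimal sketch. It is only an illustration: `PreferenceDatasetSettings`, its `guidelines` argument, and `question_title` are hypothetical names made up for this example, not distilabel's or Argilla's actual API. The question title mentions the guidelines only when the user has actually set some.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PreferenceDatasetSettings:
    """Hypothetical settings holder (an assumption, not distilabel's real class)."""

    # Stays None unless the user explicitly provides guidelines at init time.
    guidelines: Optional[str] = None

    def question_title(self, column: str) -> str:
        """Build the rating question title for one generation column."""
        title = f"Rate {column} given the instruction"
        # Only point annotators at the guidelines when there is something to read.
        if self.guidelines:
            title += " based on the annotation guidelines"
        return title


# Without guidelines, the title makes no reference to them.
print(PreferenceDatasetSettings().question_title("generations-0"))

# With user-provided guidelines, the reference is added.
custom = PreferenceDatasetSettings(guidelines="Rate helpfulness and honesty.")
print(custom.question_title("generations-0"))
```

With this shape, the default case never refers to empty guidelines, and anyone who wants guidelines sets them explicitly.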
No. We know it's a preference dataset. I'm just talking about something like:
Rate the quality of the responses to the instructions based on aspects like .... (that's what I meant by reusing some language of the UF prompt).
Either that, or simply remove the mention of the guidelines from the questions:
Rate generations-0 given the instruction *
Fair! Do you prefer that over simply removing the mention of the guidelines? Otherwise, do you have something in mind that works for most use cases? If not, we can just remove those to avoid confusion, as both the question titles and the guidelines can already be edited later within the Argilla UI 👍🏻
Apologies for the late reply. I agree the simplest and most maintainable option is to remove the reference to the guidelines in the question titles.
No worries at all 👍🏻 I'll create the PR now and invite you to review, thanks for the feedback!