Open-Assistant
Redesign visual presentation for multi-labeling task
Based on initial user feedback, the multi-labeling task is ill-defined.
Some feedback:
- Floating point sliders lack obvious meaning:
- “uh oh 101-value sliders”
- “these questions are also all true-false question - What does "67% hate speech" mean?”
- There are some missing label types
- “This instruction makes no sense. This instruction is ambiguous”
- requests “Low quality conversation” button
- Some labels don't make sense for some responses:
- "fails_task" doesn’t apply to user messages
Fixing this could require rethinking how the backend manages these labels and how the frontend presents them. Alternatively, it may be enough to reframe how the website presents the labels and answer types.
- Could we use a Likert scale?
- We need to rethink which labels are necessary and reduce the number of options. We could also show a different subset of labels to different users.
- We could also add a label or some other metric for the complexity of a prompt and the amount of work put into writing it (for detecting and encouraging high-quality prompts).
- Think about the following:
- Duplicates
- Language
- Reward signal for users (thumbs up / down)
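To make the "some labels don't make sense for some responses" point concrete, here is a minimal sketch of a label definition that records which message roles a label applies to and which answer type it uses. Everything except the `fails_task` name is an assumption for illustration (`Role`, `Widget`, `LabelSpec`, `labels_for`, and the `hate_speech` and `quality` labels are hypothetical); it is not how the current backend is structured.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    PROMPTER = "prompter"    # a user message
    ASSISTANT = "assistant"  # an assistant reply


class Widget(Enum):
    FLAG = "flag"      # true/false question
    LIKERT = "likert"  # ordered scale with a few named steps


@dataclass(frozen=True)
class LabelSpec:
    name: str
    widget: Widget
    roles: frozenset  # roles this label makes sense for


# Hypothetical label set; only "fails_task" is named in this issue.
LABELS = [
    LabelSpec("hate_speech", Widget.FLAG, frozenset({Role.PROMPTER, Role.ASSISTANT})),
    LabelSpec("fails_task", Widget.FLAG, frozenset({Role.ASSISTANT})),
    LabelSpec("quality", Widget.LIKERT, frozenset({Role.PROMPTER, Role.ASSISTANT})),
]


def labels_for(role: Role) -> list:
    """Return the names of the labels that apply to a message with this role."""
    return [spec.name for spec in LABELS if role in spec.roles]
```

With a table like this, the frontend would only ask about `fails_task` when the message being labeled is an assistant reply, and the same structure could be filtered further to show different subsets to different users.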
Agreed. We should:
- Reduce the set of labels to something small and meaningful
- Use a Likert scale for most of them
- Reward high-effort posts effectively
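If we move to a Likert scale while the backend keeps storing each label as a float, the conversion could look roughly like the sketch below. This assumes a 5-point scale and values stored in the range 0 to 1; the function names are hypothetical and the real storage format may differ.

```python
LIKERT_POINTS = 5  # assumed number of steps on the scale


def likert_to_score(choice: int, points: int = LIKERT_POINTS) -> float:
    """Map a 1-based Likert choice to an evenly spaced value in [0, 1]."""
    if not 1 <= choice <= points:
        raise ValueError(f"choice must be between 1 and {points}, got {choice}")
    return (choice - 1) / (points - 1)


def flag_to_score(checked: bool) -> float:
    """Map a true/false label to the two ends of the same range."""
    return 1.0 if checked else 0.0
```

This way a labeler only ever picks one of five named steps or ticks a box, so a value like "67% hate speech" can no longer be entered.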
Refs #872
Can we close this?
Yes!