Improve label organization and interaction in Token Classification
Description
Creating a new UI mode that helps with annotation sessions.
This mode should be available for all defined tasks, keep the user interaction as simple as possible, and be strongly oriented toward annotation.
From issue #1851
Describe the bug: With more than 10 labels, the keyboard shortcuts stop working for the remaining labels.
To Reproduce: Go to a token or text classification dataset and assign more than 10 label options.
Expected behavior: I would expect the shortcuts to continue with function keys or with the initial QWERTY letter keys.
Screenshots: (screenshot attached in the original issue)
Some ideas I would like to add to this:
- It would also be good to get more variation in colors. I know it is pretty hard to choose a distinctive enough color palette, but it is still important.
- Allowing custom key (and color) assignment, maybe something like:
  `rg.log(..., tags={"ORG": "organization", "LOC": "location", ...}, config={"shortcuts": {"ORG": "1", "LOC": "a", ...}})`
- Allow labels on the top bar to be clickable, so you can use either the dropdown list or the top bar to pick the label. (Not so sure how useful this is.)
- A slightly more complex suggestion: group labels hierarchically. Sometimes there is a logical grouping of labels. Purely as an example, say the labels can be divided like this (excuse the paint edit; image attached in the original issue). Then for keyboard shortcuts you would first press the parent tag key, then the sub-label key. Example: [1][3] would be food, [2][5] organization.
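To make the two-key idea concrete, here is a minimal sketch of how such a shortcut map could be resolved. This is not an existing Argilla API; the group names, labels, and key assignments are all hypothetical:

```python
# Hypothetical sketch of the two-key hierarchical shortcut idea above.
# Group names, labels, and key assignments are made up, not real Argilla config.
LABEL_GROUPS = {
    "1": {"name": "food", "labels": {"1": "FRUIT", "2": "VEGETABLE", "3": "FOOD_OTHER"}},
    "2": {"name": "entity", "labels": {"4": "PER", "5": "ORG", "6": "LOC"}},
}

def resolve_shortcut(first_key: str, second_key: str) -> str | None:
    """Resolve a [parent][child] key sequence, e.g. ("2", "5") -> "ORG"."""
    group = LABEL_GROUPS.get(first_key)
    if group is None:
        return None
    return group["labels"].get(second_key)

assert resolve_shortcut("2", "5") == "ORG"          # [2][5] -> organization
assert resolve_shortcut("1", "3") == "FOOD_OTHER"   # [1][3] -> a food label
```

With a map like this, the number of labels reachable from single-digit keys grows multiplicatively (up to 9 groups × 9 sub-keys) while every individual keypress stays unambiguous.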
@dvsrepo Would you say this issue has been addressed properly in the `SpanQuestion` included in v1.26.0?
`SpanQuestion` looks nice 🥳 I'll try it out this week, migrating the old `TokenClassification` dataset.
Here are my first impressions & feedback about the new `SpanQuestion`.
I have migrated my old `TokenClassification` dataset to a Feedback dataset:
@nataliaElv
- Using only numeric shortcuts is much more error prone; alphanumeric is better imo. Instead of keyboard shortcuts going 10, 11, 12…, I would rather they continued with letters (in QWERTY order), because typing two digits in quick succession is error prone: you want to hit 14 but accidentally end up with 4 because you were a bit slow that time.
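A rough sketch of the suggested assignment scheme, assuming labels simply get the next free key in QWERTY order once the digits run out (the label names and key order here are illustrative, not Argilla's actual behavior):

```python
# Digits 1-9 for the first nine labels, then single QWERTY letters instead of
# error-prone two-digit sequences like "10" or "14". Purely illustrative.
QWERTY_KEYS = list("123456789") + list("qwertyuiopasdfghjklzxcvbnm")

def assign_shortcuts(labels: list[str]) -> dict[str, str]:
    """Map each label to a single key so every shortcut is one keypress."""
    return {label: key for label, key in zip(labels, QWERTY_KEYS)}

labels = [f"LABEL_{i}" for i in range(14)]
print(assign_shortcuts(labels)["LABEL_9"])  # "q" instead of the two-key "10"
```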
@dvsrepo Here is more general feedback about the `SpanQuestion` (compared to the old dataset, and for its better future):
Visual:
- It kind of feels like the old UI looked a lot better. The main problem is that most of the time the labels don't fit (especially for non-English, character-level annotation, which is denser). Example: (screenshot attached in the original issue)
- I wouldn’t mind a bigger font size on the text if it meant fitting the labels; it would make text selection easier too.
- Also, for readability, just a tiny bit more margin between the text and the labels looks better imo (from tinkering in the CSS). Maybe even make the sizing configurable? I know that's easier said than done 😶🌫️
- Representing the score as a transparency value might be a good idea (a rough sketch follows this list). I don't really see the point of the tiny star emoji taking up more screen space. That said, I liked the old dataset UI's style of showing annotations on top and predictions at the bottom, because it made it clear where the differences were, especially when you are fixing a model.
- The searched keyword is no longer colored red in the results. (I know it sounds nitpicky, but it really makes a difference when you are looking at a block of text.)
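For the score-as-transparency bullet above, a minimal sketch of the mapping, assuming the prediction score is in [0, 1] and is used directly as the highlight's opacity (the color and the alpha floor are arbitrary choices, not how Argilla currently renders spans):

```python
# Map a span prediction score to the alpha channel of its highlight color.
# The RGB value and the 0.15 alpha floor are arbitrary illustrative choices.
def score_to_rgba(score: float, rgb: tuple[int, int, int] = (255, 160, 0)) -> str:
    alpha = max(0.15, min(1.0, score))  # floor keeps low-score spans visible
    r, g, b = rgb
    return f"rgba({r}, {g}, {b}, {alpha:.2f})"

print(score_to_rgba(0.92))  # confident span: nearly opaque highlight
print(score_to_rgba(0.30))  # uncertain span: faint highlight
```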
Functional:
- I would hope span annotation becomes available in the bulk view in future versions 🥺 🙏 because I use that a lot for going over annotations at a glance, usually combined with search: when I know a certain word/phrase has been mislabeled, I do a search and check/correct it quickly. It would be icing on the cake if I could search, make a selection, apply a label, and then apply that same (exact-search-based) change to the other records in the bulk selection (there is a rough sketch after this list). At the very least, prioritizing a bulk view 🤞. The lack of a bulk view (like the old dataset had) really hinders the usefulness of other features, like similarity search.
- [Low priority] If a word already has a label, for example "FRUITS", and I just select the "S" and give it a new label, that shouldn't remove the existing label from the remainder of the word ("FRUIT").
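For the search-and-bulk-label idea above, a sketch of the core step: finding every exact occurrence of a phrase in a batch of records and turning each match into a character-level span with the chosen label. The plain-string record format is an assumption for illustration; pushing the resulting spans back as responses or suggestions would go through Argilla's normal APIs:

```python
# Find exact matches of a phrase and emit character-level spans for one label.
# Records are plain strings here, an assumption made for illustration only.
def find_spans(text: str, phrase: str, label: str) -> list[dict]:
    spans, start = [], text.find(phrase)
    while start != -1:
        spans.append({"start": start, "end": start + len(phrase), "label": label})
        start = text.find(phrase, start + 1)
    return spans

records = ["I love Hugging Face.", "Hugging Face hosts Argilla."]
for text in records:
    print(find_spans(text, "Hugging Face", "ORG"))
# [{'start': 7, 'end': 19, 'label': 'ORG'}], [{'start': 0, 'end': 12, ...}]
```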
In the future, my hope is to combine multiple tasks (span & text classification, etc.) into one record in a Feedback dataset. I haven't played around much with the actual annotation part yet, so I may have more feedback soon 🙈
Thanks @cceyda for the detailed feedback! We'll take notes on this 😃 cc @Amelie-V