
Progress Bar proportional to actual labelling tasks rather than all data points

Open credelosa opened this issue 2 years ago • 4 comments

Good day @janfreyberg!

Great work on making superintendent. I love how useful it is.

I recently tried using widget.add_features with labelled data and then did some labelling. I noticed that the progress bar is based on the combined length of the unlabelled and labelled data. Would it be possible to have an option where the progress bar is based only on how many points you have left to label manually? If you add a lot of labelled data with add_features and have only a little left to label by hand, it is difficult to keep track of progress, because the bar is almost full right away (e.g. with 75 initially labelled data points and 25 unlabelled ones to label manually, the progress bar would be 75% filled from the start). I'm not sure whether it was meant to be built this way, or whether it is possible to have this option :)
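To make the arithmetic in that example concrete, here is a minimal sketch in plain Python. It does not use superintendent's internals; `progress_fraction` and its parameters are hypothetical names chosen only for illustration.

```python
def progress_fraction(n_prelabelled, n_to_label, n_done, count_prelabelled):
    """Return the fraction a progress bar would show (hypothetical helper).

    n_prelabelled: points that arrived with labels (e.g. via add_features)
    n_to_label: points that needed a manual label at the start
    n_done: points labelled manually so far
    count_prelabelled: if True, pre-labelled points count as completed work
    """
    if count_prelabelled:
        total = n_prelabelled + n_to_label
        done = n_prelabelled + n_done
    else:
        total = n_to_label
        done = n_done
    # With nothing to do, treat the bar as full rather than dividing by zero.
    return done / total if total else 1.0


# 75 pre-labelled points, 25 to label manually, none labelled yet.
combined = progress_fraction(75, 25, 0, count_prelabelled=True)
tasks_only = progress_fraction(75, 25, 0, count_prelabelled=False)
print(combined, tasks_only)  # prints "0.75 0.0"
```

The first value corresponds to the behaviour described above (bar based on all data points), the second to the requested option (bar based only on manual labelling tasks).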

Would love to hear from you soon. Thank you!

credelosa, Sep 20 '22 06:09

Hi, thank you for the kind words :)

This is a good question. I think it makes sense to show progress only based on what needs to be labelled. I have actually been thinking about this in the context of allowing multiple labellers to label the same data point, or allowing the administrator to partition the labelling tasks between multiple labellers.

I will think a bit more about how this can be implemented, as there is some nuance here. What if you add 100 data points, 75 labelled and 25 unlabelled, then label 10, close the labelling session, and come back the next day to label the remaining 15? Would you expect the progress bar to start at 0 again, or would you want it to show 10/25?

Thanks again for the suggestion.

janfreyberg, Sep 20 '22 08:09

Hello! Thanks for the quick reply.

> I will think a bit more about how this can be implemented as there is some nuance here. What if you add 100 data points, 75 labelled and 25 unlabelled, then label 10, close the labelling session, and come back the next day to label the remaining 15? Would you expect the progress bar to start at 0 again, or would you want it to show 10/25?

Good point. Personally, I think it should start at 0 again, assuming we stick to the progress bar being proportional to actual labelling tasks. In other words, I would consider today's and tomorrow's labelling sessions to be separate sessions, and thus to have two independent progress bars (i.e. both would start at 0). Bottom line: the progress bar should show how many items you have left to label in the current session.
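As a rough sketch of this per-session behaviour, assuming each session simply counts the points that are unlabelled when it starts: `SessionProgress` below is a hypothetical class written for illustration, not part of superintendent.

```python
from dataclasses import dataclass


@dataclass
class SessionProgress:
    """Hypothetical per-session tracker; not superintendent's API."""

    remaining_at_start: int  # unlabelled points when the session opened
    labelled_this_session: int = 0

    def record_label(self):
        # Never count more labels than there were tasks in this session.
        if self.labelled_this_session < self.remaining_at_start:
            self.labelled_this_session += 1

    @property
    def fraction(self):
        if self.remaining_at_start == 0:
            return 1.0
        return self.labelled_this_session / self.remaining_at_start


# Day one: 25 unlabelled points, 10 of them get labelled.
day_one = SessionProgress(remaining_at_start=25)
for _ in range(10):
    day_one.record_label()

# Day two: a fresh session that only knows about the 15 points left.
day_two = SessionProgress(
    remaining_at_start=day_one.remaining_at_start - day_one.labelled_this_session
)
print(day_one.fraction, day_two.fraction, day_two.remaining_at_start)
```

Under this assumption the second session starts at 0 out of 15. The alternative raised in the question (showing 10/25 on the second day) would require storing the original task count across sessions.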

credelosa, Sep 20 '22 09:09

Makes sense. OK, thanks for the feedback. Once I've merged some other changes I am working on, I will start a branch for this.

janfreyberg, Sep 20 '22 09:09

Great! Thanks for the swift replies and for being receptive to this suggestion. Looking forward to this update. 👍

credelosa, Sep 20 '22 09:09