coralnet
'Reset classifiers' and 'force retrain' buttons for source admins
Recently, it seems that many source owners have caught on to the fact that adding/removing labelset labels can force retraining (by way of resetting the source's classifiers) when training gets stuck. So they'll actually add a dummy label and then remove the dummy label, just to make this happen.
Hopefully with PR #409, we've addressed the main reason for training getting stuck. However:
- Sources don't retrain when annotations are just changed (not added). And they also don't retrain when annotations or images are deleted.
- There have been many times when people just got confused by the classifier acceptance criteria, and thought it might've gotten stuck when it actually hadn't.
- At times, people have been testing CoralNet by seeing what classifier accuracy they get with exactly N images, or a specific set of images. But that's difficult to test when the source only retrains on a 1.1x image gain, and doesn't show stats for newly trained classifiers that were rejected.
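
To make the 1.1x rule and the proposed override concrete, here's a rough sketch of how a 'force retrain' flag could bypass the image-gain check. This is not CoralNet's actual code: the function name `needs_retrain`, its parameters, and the exact comparison are all assumptions for illustration.

```python
def needs_retrain(num_images_now, num_images_at_last_training,
                  gain_factor=1.1, force=False):
    """Hypothetical check for whether a source should train a new classifier.

    All names here are illustrative assumptions, not CoralNet's API.
    """
    if force:
        # A 'force retrain' request skips the image-gain criterion entirely.
        return True
    if num_images_at_last_training == 0:
        # No classifier has been trained yet: train once there is any data.
        return num_images_now > 0
    # Normal rule: retrain only once the image count has grown by gain_factor.
    return num_images_now >= gain_factor * num_images_at_last_training
```

With this shape, someone testing accuracy at exactly N images could request a forced retrain instead of having to upload 10% more images first.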
There are probably smarter solutions for all of these pain points, but I think 'reset classifiers' and 'force retrain' button controls would be simple and versatile band-aids that people would find useful. This would probably go best with the idea we had a while back, about separating source admin controls into more than one page (#92). But at minimum, these buttons could just be added to the existing source Admin page.
Obviously, 'reset classifiers' needs to come with a warning that the source's classifier history will be erased (similar to the warning shown when the labelset is changed). 'Force retrain', by contrast, would not erase previous classifiers, but also would not guarantee that the newly trained classifier will be accepted.
If we want to limit how often these buttons can be clicked, we can disable the buttons if the last classifier's `create_date` was less than, say, 3 hours ago. Or maybe there are other avenues: detect when a classifier is already queued for training, detect when there are truly no training-relevant changes to the source (the `last_annotation` fields could help), etc.
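
As a rough sketch of how those checks might combine, the function below decides whether the buttons should be enabled. The function name, its parameters, and the way the dates are passed in are assumptions for illustration only; the real model fields and queue lookup would need to be wired in. Note that comparing against the last annotation date is only a heuristic, since it wouldn't notice deleted annotations or images.

```python
from datetime import datetime, timedelta, timezone

# Assumed cooldown, taken from the "say, 3 hours" suggestion above.
MIN_RETRAIN_INTERVAL = timedelta(hours=3)


def retrain_button_enabled(last_classifier_create_date, last_annotation_date,
                           training_queued, now=None):
    """Hypothetical helper: should 'force retrain' be clickable right now?"""
    if now is None:
        now = datetime.now(timezone.utc)
    if training_queued:
        # A classifier is already queued for training.
        return False
    if last_annotation_date is None:
        # No annotations at all, so there is nothing to train on.
        return False
    if last_classifier_create_date is None:
        # Annotations exist but no classifier has been trained yet.
        return True
    if now - last_classifier_create_date < MIN_RETRAIN_INTERVAL:
        # The last classifier was created too recently.
        return False
    # Only allow a retrain if something was annotated since the last classifier.
    return last_annotation_date > last_classifier_create_date
```

The view for the source Admin page could call something like this and render the buttons as disabled (with a short explanation) whenever it returns False.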