[FEATURE] Checkpointing
Hello,
I was confused to see that the default training pipeline does not save state_dicts at the lowest validation loss. Also, I am curious why there is no validation splitting. There is "test data" (= validation data?) that you can pass in, but I don't see where it is used for checkpointing.
Kind regards, Daniel
Currently, in the `train.train_seg()` function, you can save a model every `save_every` epochs or only at the end. Is there a reference that says the model with the lowest validation loss is saved?
I wasn't saying the paper claims to have checkpointing based on validation loss while the code doesn't. I was just expecting the feature, since checkpointing models on validation loss is common practice in deep learning. Perhaps there is a reason cellpose doesn't do it; maybe it is assumed the network can only improve with more training, which I doubt, after all we are optimizing with AdamW.
While checkpointing based on the validation loss isn't built into cellpose, you can save the model at every validation step. We leave it up to the user to choose between those checkpointed models and the final one.
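For anyone wanting best-on-validation selection on top of the per-step checkpoints, here is a minimal dependency-free sketch of the pattern. The helper `train_with_best_checkpoint` and its callbacks are hypothetical, not part of the cellpose API; the dict `state` stands in for a model `state_dict`, which in a real training loop you would snapshot with `copy.deepcopy(model.state_dict())` (or `torch.save`):

```python
import copy

def train_with_best_checkpoint(train_step, validate, state, n_epochs):
    """Hypothetical training loop that keeps the snapshot with the
    lowest validation loss (not a cellpose function)."""
    best_loss = float("inf")
    best_state = copy.deepcopy(state)
    for epoch in range(n_epochs):
        train_step(state, epoch)           # updates `state` in place
        val_loss = validate(state)
        if val_loss < best_loss:           # keep the lowest-val-loss snapshot
            best_loss = val_loss
            best_state = copy.deepcopy(state)
    return best_state, best_loss

# Toy demo: the "validation loss" dips at epoch 2, then rises again,
# so the returned snapshot is the one taken at epoch 2.
losses = [0.9, 0.5, 0.3, 0.6, 0.8]
state = {"epoch": -1}

def train_step(s, e):
    s["epoch"] = e                         # stand-in for a weight update

def validate(s):
    return losses[s["epoch"]]              # stand-in for a validation pass

best_state, best_loss = train_with_best_checkpoint(train_step, validate, state, 5)
print(best_state, best_loss)  # → {'epoch': 2} 0.3
```

The same logic can wrap the checkpoints `train.train_seg()` already writes: run validation after each save and keep track of which saved file scored lowest.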