nnUNet
About Cross Validation
@FabianIsensee Thanks for your great work! It helps me a lot!
My dataset is small: it contains only 130 patients, all in the MRI modality. Because of that, I want to use 5-fold cross-validation to evaluate the model, with no separate test set. In the default nnUNet setting, 5-fold cross-validation is used to select the best hyperparameters, which are then applied to the test set. In my case there is no test set, so how should I run 5-fold cross-validation correctly?
In https://github.com/MIC-DKFZ/nnUNet/issues/695#issuecomment-1011331359, you mention that "model_best.model uses an exponential moving average of the green line to determine what is considered 'best'. This is not very well tunes because I don't use it and it should also never be used to report 5-fold cross-validation scores (overfitting to validation set)."
So should a correct 5-fold cross-validation use model_final_checkpoint.model?
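To make the quoted concern concrete, here is a minimal sketch of picking a "best" epoch from an exponential moving average of a validation metric. It is not nnUNet's code: the function name `best_epoch_by_ema`, the smoothing factor `alpha=0.9`, and the metric values are all assumptions made up for illustration.

```python
def best_epoch_by_ema(val_metrics, alpha=0.9):
    """Return (best_epoch, ema_values) for a list of per-epoch validation metrics.

    alpha is an assumed smoothing factor, not a value taken from this thread.
    """
    ema = None
    best_ema = float("-inf")
    best_epoch = -1
    ema_values = []
    for epoch, metric in enumerate(val_metrics):
        # First epoch initialises the average; later epochs smooth it.
        ema = metric if ema is None else alpha * ema + (1 - alpha) * metric
        ema_values.append(ema)
        if ema > best_ema:
            best_ema = ema
            best_epoch = epoch
    return best_epoch, ema_values


# Made-up validation Dice values that peak and then degrade.
val_dice = [0.60, 0.80, 0.85, 0.40, 0.30, 0.30]
best_epoch, ema_values = best_epoch_by_ema(val_dice)
final_epoch = len(val_dice) - 1
print("best epoch:", best_epoch, "final epoch:", final_epoch)
```

The "best" epoch is chosen by looking at the validation fold itself, so a score reported from that checkpoint has already been optimised on the data it is evaluated on. The final epoch is fixed in advance and involves no such selection.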
My questions are as follows:
- For a small dataset, is it correct to skip a separate test set and evaluate the model with 5-fold cross-validation only? Or should I split my 130 cases into a training set and a test set, as one would with a large dataset?
- Is validating with model_final_checkpoint.model necessary for a proper cross-validation? If I validate with model_best instead, what effect will that have?
Thanks!
Hi,
I think there is a misunderstanding, if I understand cross-validation in nnU-Net correctly.
nnUNet assumes that there is no test set, which is the same as your situation.
nnUNet runs 5-fold cross-validation, splitting the dataset into a training part and a validation part for each fold. For example, with 130 patients you get 5 splits of (104, 26).
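The (104, 26) arithmetic can be checked with a small sketch. This is a plain-Python illustration, not nnUNet's own split code: the function `five_fold_splits`, the seed, and the `patient_XXX` identifiers are hypothetical.

```python
import random


def five_fold_splits(case_ids, n_folds=5, seed=12345):
    """Split case IDs into n_folds train/val partitions (seed is arbitrary)."""
    ids = sorted(case_ids)
    random.Random(seed).shuffle(ids)
    # Deal the shuffled cases into n_folds groups of near-equal size.
    folds = [ids[i::n_folds] for i in range(n_folds)]
    splits = []
    for k in range(n_folds):
        val = folds[k]
        train = [c for j, fold in enumerate(folds) if j != k for c in fold]
        splits.append({"train": sorted(train), "val": sorted(val)})
    return splits


# Hypothetical identifiers for 130 patients.
patients = [f"patient_{i:03d}" for i in range(130)]
splits = five_fold_splits(patients)
for k, split in enumerate(splits):
    print(k, len(split["train"]), len(split["val"]))
```

Each patient lands in exactly one validation fold, so the five validation sets together cover all 130 cases.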
I don't think your dataset is so small that you cannot hold out a test set to check your model. In my case, I have 16 patients: I keep 4 of them as a test set and use only the remaining 12 for 5-fold cross-validation.
I think nnUNet uses model_best rather than model_final_checkpoint, although I need to double-check the code.
Best
@Joeycho Dear Joeycho
Thanks for your reply! I am still unsure which model is used for validation (the final model or the best model).
When I set '--validation_only' and '--valbest' to 'True', the code reaches the line below, which explicitly loads the best model: https://github.com/MIC-DKFZ/nnUNet/blob/aa53b3b87130ad78f0a28e6169a83215d708d659/nnunet/run/run_training.py#L182
However, in the default training flow of nnUNet, I could not find any code that loads the best model.
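The behaviour described above can be summarised as a small decision function. This is only a sketch of that reading, not the code in run_training.py: the function `checkpoint_to_load` is hypothetical, and the branch for validation without '--valbest' is an assumption that should be checked against the linked file.

```python
from typing import Optional


def checkpoint_to_load(validation_only: bool, valbest: bool) -> Optional[str]:
    """Return the checkpoint file a validation run would load, or None.

    Hypothetical helper that mirrors the flags discussed in this thread.
    """
    if not validation_only:
        # Default flow: train to the end, then validate with the weights
        # already in memory, so no checkpoint file is reloaded.
        return None
    if valbest:
        return "model_best.model"
    # Assumption: validation-only runs fall back to the final checkpoint.
    return "model_final_checkpoint.model"


for flags in [(False, False), (True, False), (True, True)]:
    print(flags, "->", checkpoint_to_load(*flags))
```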
Best
Hi @peeelS,
I hope that was helpful.
Loading the final model in the training flow makes sense to me.
The best model is loaded only for validation and testing, to look at its results.
During training, even if the current result is worse, we want to keep the latest model so that its parameters continue to be updated. Otherwise, training might get stuck at the best model from a very early phase.
Hello, sorry for the late response. Is this question still relevant? Otherwise, I would close the issue. Cheers, Ole