
How does nnUNet performance scale with more training data?

Open charlesmoatti opened this issue 3 years ago • 0 comments

Just a small remark/question, since I did not find it discussed in previous issues. This is what I have observed so far working with nnUNet.

Giving more training data to an nnUNet model, e.g. going from 80 to 800 3D training images, should not have much impact, given the strategy of a fixed number of batches per epoch (250) and the multitude of data augmentation techniques used. The fact that nnUNet does not guarantee exposure to every image during training, but instead randomly samples a case for each patch, reinforces this thought for me (see the sketch below). I think that nnUNet's performance plateau is reached quickly with a small number of samples (say, in the ~50 3D images range, depending on the exact task) and that past this point adding more data is useless, at least for performance metrics such as Dice score.
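To make the intuition concrete, here is a minimal sketch of the sampling scheme I am describing. This is not nnUNet's actual dataloader; the function name `run_epoch`, the batch size, and the loop structure are all hypothetical, with only the 250 iterations per epoch and 1000 epochs taken from nnUNet's defaults:

```python
import random

# Hypothetical sketch, loosely modeled on nnUNet's default of
# 250 minibatches per epoch. NOT the actual nnUNet dataloader.
ITERATIONS_PER_EPOCH = 250
BATCH_SIZE = 2  # assumption; the real 3D batch size depends on the plan

def run_epoch(case_ids):
    """One epoch always runs ITERATIONS_PER_EPOCH batches, no matter
    how many training cases exist. Each batch draws cases at random
    (with replacement), so going from 80 to 800 cases changes only
    how often any single case is revisited, not the compute per epoch."""
    for _ in range(ITERATIONS_PER_EPOCH):
        batch = [random.choice(case_ids) for _ in range(BATCH_SIZE)]
        # a random patch would be cropped from each case, augmented,
        # and fed to the network here
        yield batch

# Example: with 80 cases, each case is drawn ~6.25 times per epoch on
# average (250 * 2 / 80); with 800 cases, only ~0.625 times. Over the
# default 1000 epochs, though, even 800 cases are each seen ~625 times,
# so the larger dataset is still covered many times over a full run.
for batch in run_epoch(case_ids=list(range(800))):
    pass
```

Note the counterpoint this sketch also exposes: while epoch length is fixed, total training (1000 epochs) still cycles through a larger dataset many times, so more data does reach the network over a full run.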

Please correct me if I am wrong; any thoughts are welcome.

Any thoughts on how nnUNet performance/robustness would scale with a lot more training data? Is nnUNet's performance plateau reached quickly with relatively few training cases?

charlesmoatti · Sep 20 '22