MIScnn
Soft or Hard ground-truth masks?
Dear MIScnn team,
I was wondering: when training or testing a segmentation model, do you binarize the ground-truth masks that have been transformed by preprocessing / data augmentation operations just before feeding them to the network? Is there a parameter that controls this? If so, what are your thoughts on the best practice?
Cheers
Hey @charleygros,
thanks for this interesting question.
I have personally never worked on a medical imaging data set with soft masks, but there are obviously scenarios in which it makes sense to utilize soft masks.
Currently MIScnn has the following workflow:
- Running preprocessing functions (in MIScnn called subfunctions)
- Performing the binarization of ground-truth masks via One-Hot-Encoding (to_categorical function of Keras)
- Run data augmentation functions
- Feed it to the network
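For illustration, step 2 above (one-hot encoding a hard integer mask, as Keras' `to_categorical` does) can be sketched as follows. The minimal re-implementation below is an assumption for self-containedness, not MIScnn's actual code:

```python
import numpy as np

def to_categorical(mask, num_classes):
    """Minimal stand-in for Keras' to_categorical on integer masks:
    each class index becomes a one-hot vector along a new last axis."""
    return np.eye(num_classes, dtype=np.float32)[mask]

# A 2x2 hard mask with classes {0, 1, 2}
mask = np.array([[0, 1],
                 [2, 1]])

onehot = to_categorical(mask, num_classes=3)
print(onehot.shape)  # (2, 2, 3)
```

Note that this step is exactly where a soft mask would be destroyed: the input must already be hard integer labels for the index lookup to work.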
I have been thinking about this problem for a few minutes now, and I guess this feature could be integrated into MIScnn quite easily.
If I implement a functionality to disable the one-hot encoding, you should be able to feed soft masks to the network and run the pipeline. MIScnn already provides a function to return the direct softmax outputs instead of the most probable class, so you also get a soft prediction.
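As a sketch of the difference between the two output modes, hard (most probable class) versus soft (raw softmax), using hypothetical variable names rather than MIScnn's actual API:

```python
import numpy as np

# Hypothetical per-pixel softmax output of the network
# for a 2x2 image with 3 classes (shape: H x W x C)
softmax_out = np.array([[[0.7, 0.2, 0.1],
                         [0.1, 0.6, 0.3]],
                        [[0.2, 0.2, 0.6],
                         [0.3, 0.4, 0.3]]])

hard_pred = np.argmax(softmax_out, axis=-1)  # most probable class per pixel
soft_pred = softmax_out                      # keep the probabilities as-is
print(hard_pred)
```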
Sadly, I have one problematic scenario in mind: data augmentation. MIScnn integrates the batchgenerators package from the Division of Medical Image Computing at the German Cancer Research Center. When running e.g. image scaling, we have to be aware that the scaling of a soft segmentation mask is not perfectly correlated to the image scaling. I have the feeling it makes sense to teach the model to assess intensity value differences. Augmentations which influence these intensity values should theoretically also impact the soft mask segmentation values, which is not the case in conventional data augmentation for hard masks.
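The interpolation issue can be illustrated with `scipy.ndimage.zoom` as a stand-in for the batchgenerators internals (not reproduced here): nearest-neighbor scaling only reuses the original mask values, while linear interpolation invents new intermediate confidence values that were never in the annotation:

```python
import numpy as np
from scipy.ndimage import zoom

# Hypothetical 1D soft mask with a smooth object boundary
soft = np.array([0.0, 0.0, 0.5, 1.0, 1.0])

# order=0: nearest-neighbor -> output values are a subset of the input values
scaled_nn = zoom(soft, 2, order=0)

# order=1: linear interpolation -> new intermediate values appear
scaled_lin = zoom(soft, 2, order=1)
```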
But if you want to try this out, I will implement a parameter for deactivating the one-hot encoding.
Do you have a good data set with soft masks in mind, or do you have access to private medical imaging data?
Cheers, Dominik
Many thanks for your very prompt answer and very helpful insights @muellerdo! Really appreciate it!
Yes, the idea would be to: (i) not apply thresholding on the masks between your steps 3 and 4, (ii) use linear interpolation in your steps 1 and 3 (instead of nearest-neighbor) and, as you said, (iii) allow disabling your step 2.
I really don't have the answer to this question... I am just curious! I feel that soft masks carry a level of confidence that would be important to consider / feed to the network, and nearest-neighbor interpolation is a source of approximations / errors. But at the same time, at the end of the day, we want to achieve hard segmentations... so it would depend on the application, I guess.
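As a minimal sketch of that final step, collapsing a soft prediction into the hard segmentation we ultimately want, here with a hypothetical 0.5 threshold on a single foreground probability map:

```python
import numpy as np

# Hypothetical soft foreground probabilities per pixel
soft_pred = np.array([[0.20, 0.80],
                      [0.55, 0.40]])

# Threshold at 0.5 to obtain the final hard binary mask
hard_mask = (soft_pred >= 0.5).astype(np.uint8)
print(hard_mask)
```

The choice of threshold is itself application-dependent, which is part of why the hard-vs-soft question has no universal answer.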
> MIScnn already provide a function to return direct softmax outputs instead of the most probable class, therefore you also get a soft prediction.
Very good to know, thanks for pointing this out!
Sadly, I don't have much time right now to test this, sorry about that. But I was curious about your opinion and the best practice you would recommend regarding this question of hard vs. soft training.