Binary segmentation for 3D ultrasound images doesn't work well
I used nnU-Net to train on a private dataset of 3D ultrasound images for a binary segmentation task. I followed the instructions to construct the data folder and the program ran successfully. However, the predictions generated by the network are much worse than I expected (lower Dice than a simple U-Net), and there are empty segmentations for some valid cases. I cannot figure out where the problem is. Can you guide me on what is wrong here?
Since the data is private, I can email you some samples. I would appreciate it if you have time to help me out.
Hi there, this is really hard to say without looking at the data. The most likely problem is that something went wrong during the dataset conversion. Can you please copy the output of your nnUNet_plan_and_preprocess run? I am looking for this section:
before: {'spacing': array([999., 1., 1.]), 'spacing_transposed': array([999., 1., 1.]), 'data.shape (data is transposed)': (3, 1, 1500, 1500)} after: {'spacing': array([999., 1., 1.]), 'data.shape (data is resampled)': (3, 1, 1500, 1500)}
Also please give some details on what your label values are. Are they consecutive integers with 0 being background?
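A quick way to answer that question is to look at the unique values in a loaded label map. This is a minimal sketch (using a made-up toy array in place of a real segmentation) of the check nnU-Net expects to pass, i.e. consecutive integers starting at 0:

```python
import numpy as np

# Stand-in for a loaded segmentation array (e.g. from SimpleITK or nibabel)
seg = np.array([[0, 0, 1],
                [1, 2, 0]])

labels = np.unique(seg)
print(labels)  # [0 1 2]

# Valid for nnU-Net: starts at 0, consecutive integers
is_valid = labels[0] == 0 and bool(np.all(np.diff(labels) == 1))
print(is_valid)  # True
```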
Thank you for your prompt reply!
- before: {'spacing': array([1., 1., 1.]), 'spacing_transposed': array([1., 1., 1.]), 'data.shape (data is transposed)': (1, 51, 576, 764)} after: {'spacing': array([1.282432, 1.282432, 1.282432]), 'data.shape (data is resampled)': (1, 40, 449, 596)}
- The original labels are non-binary, but I converted them into 2 classes (background/foreground) in my own code.
If I could send you an email, I'd like to share some data samples with you.
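As a side note, the preprocessing output above can be sanity-checked: the resampled shape should be roughly the original shape scaled by the spacing ratio. A rough sketch in plain NumPy (not nnU-Net's actual resampling code), using the numbers from the log:

```python
import numpy as np

old_spacing = np.array([1.0, 1.0, 1.0])
new_spacing = np.array([1.282432, 1.282432, 1.282432])
old_shape = np.array([51, 576, 764])

# Voxel count scales inversely with voxel spacing
new_shape = np.round(old_shape * old_spacing / new_spacing).astype(int)
print(new_shape)  # [ 40 449 596] -- matches the log above
```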
Hi, you said that you are working with 2D data and yet your images are 3D (shape (1, 51, 576, 764)). How is that possible? Best, Fabian PS: you can send samples to [email protected]. Ideal would be the entire dataset so that I can reproduce the bad performance
Hi, my fault (#-.-), I meant to say 3D ultrasound images. Sorry about the misunderstanding; I have re-edited the issue. I have attached the whole dataset and my data conversion code in an email for your reference. The first-level directory of the original data folder represents the device number, and the second-level directory represents the patient ID. I trained a 2D ResUNet with data from specific devices (specified in my code), and the mean Dice was up to 0.68, while when I trained with nnU-Net the results seemed abnormal and relatively low. I hope you can give me some guidance.
Thank you very much for your time!
Best
I haven't gotten an email yet - it's probably stuck somewhere and will show up in my inbox soon. Have you used the same splits for training nnU-Net and your 2D ResUNet? If not, then this could be the culprit. If you used custom splits in nnU-Net, can you please send the splits_final.pkl file as well?
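For reference, splits_final.pkl typically holds a plain Python list with one {'train': [...], 'val': [...]} dict per fold, so it can be inspected with the standard pickle module. A sketch with made-up case identifiers (the write step only exists here to make the example self-contained; for a real file, just do the read step):

```python
import os
import pickle
import tempfile

# Made-up splits in the structure nnU-Net uses: one dict per fold
splits = [
    {'train': ['case_000', 'case_001', 'case_002'], 'val': ['case_003']},
    {'train': ['case_000', 'case_002', 'case_003'], 'val': ['case_001']},
]

path = os.path.join(tempfile.mkdtemp(), 'splits_final.pkl')
with open(path, 'wb') as f:
    pickle.dump(splits, f)

# Inspect an existing splits_final.pkl the same way:
with open(path, 'rb') as f:
    loaded = pickle.load(f)
for i, fold in enumerate(loaded):
    print(f"fold {i}: {len(fold['train'])} train / {len(fold['val'])} val cases")
```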
Just to let you know that I am currently running a training. In the future, please make sure the script you send me actually works ;-) There were a couple of syntax errors in there. From what I can see there is a large fluctuation in performance between folds, so whenever you are comparing two methods, make sure they both use the same splits! The segmentation problem seems to be very hard to solve. More once the trainings are finished.
Thanks a lot, I'll send you the corresponding splits_final.pkl file later.
Best
Hi, I can confirm that the Dice scores are really low for that dataset. From what I can see the segmentation task is just too difficult with too few cases for how difficult it is. If you have gotten better performance with your model then that's great! Given the lack of training cases I would probably also try to use a regular 2D model pretrained on natural images. This could help. The best solution would be to collect more training data though ;-) Best, Fabian
Hi Fabian, thank you for your reply, which is quite useful to me. I have been struggling with this dataset for a long time. As you said, I should start collecting more data next. By the way, I found out that this dataset was annotated by different annotators. I think the label noise caused by inter-observer variability may confuse the model. Do you have any good suggestions? Best!
There is a lot of overfitting, and this could be a sign that high inter-rater variability is present. It is difficult for me to interpret the data because I would not be able to recognize the target structures myself without consulting the segmentations. If you have evidence that inter-rater variability is a problem, then tackling that might be a solution as well. For now, though, I think that collecting more data would be more time efficient.