Diff-UNet

different implementation during testing?

jessie-chen99 opened this issue 1 year ago • 6 comments

Thanks for your great work, but I have run into a small problem:

Why is the implementation in this line different from the formula in the paper? https://github.com/ge-xing/Diff-UNet/blob/26990018c52b60a57a1ee8ebfb3e807897af1a1a/BraTS2020/test.py#L93

[image]

Should I change `sample_outputs[i]["all_samples"][index].cpu()` to `uncer_out` in https://github.com/ge-xing/Diff-UNet/blob/26990018c52b60a57a1ee8ebfb3e807897af1a1a/BraTS2020/test.py#L87

jessie-chen99 avatar May 04 '23 14:05 jessie-chen99

No.

[image]

As the figure shows, the sum of `sample_outputs[i]["all_samples"][index].cpu()` over the uncertainty steps is equivalent to \bar{p_i}.
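The relationship being claimed here can be sketched as follows. This is a hedged toy reduction, not the repository's actual `test.py`: the names `uncer_step`, `sample_outputs`, and `weights` follow the thread, and the weights are made uniform purely for the demonstration.

```python
import torch

# Toy sketch of the test-time fusion under discussion: `uncer_step`
# independent runs each produce one prediction, and the weighted sum of
# those per-run predictions gives the fused output Y.
uncer_step = 4
num_classes, D = 3, 8  # toy tensor sizes

torch.manual_seed(0)
# one "all_samples"-style prediction per uncertainty run, values in [-1, 1]
sample_outputs = [torch.rand(num_classes, D) * 2 - 1 for _ in range(uncer_step)]
# per-run weights, e.g. derived from uncertainty maps; uniform here for the demo
weights = [torch.full((num_classes, D), 1.0 / uncer_step) for _ in range(uncer_step)]

# accumulation in the style of the nested loops in test.py
sample_return = torch.zeros(num_classes, D)
for i in range(uncer_step):
    sample_return += weights[i] * sample_outputs[i]

# the same computation written as the paper's formula Y = sum_i w_i * p_bar_i
Y = sum(w * p for w, p in zip(weights, sample_outputs))
assert torch.allclose(sample_return, Y)
```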

920232796 avatar May 05 '23 01:05 920232796

> No.
>
> [image]
>
> As the figure shows, the sum of `sample_outputs[i]["all_samples"][index].cpu()` over the uncertainty steps is equivalent to \bar{p_i}.

😊 Thanks a lot for your reply! But I am still confused about this part.

[image]

As your code shows, the \bar{p_i} in the formula Y = ∑ w_{i} × \bar{p_i} is implemented as `sample_outputs[i]["all_samples"][index].cpu()` itself, not as "the sum of `sample_outputs[i]["all_samples"][index].cpu()`" that you just described.

According to these nested for-loops, I reckon the formula Y = ∑ w_{i} × \bar{p_i} is effectively rewritten as below, which differs from the paper.

[image]

So I am confused about which version is the correct one.

  1. In your implementation, `["all_model_outputs"][index]` is x_0 (also called x_start) without `process_xstart()`, i.e. the raw output of the diffusion U-Net, while `["all_samples"][index]` is x_0 after `process_xstart()`. They have different value ranges: `["all_samples"][index]` lies in [-1, 1], but `["all_model_outputs"][index]` can range over, for example, [-17, 18]. Why do you need to use both `["all_model_outputs"]` and `["all_samples"]`?
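The distinction asked about in question 1 can be illustrated with a small hypothetical sketch. `process_xstart` here is a stand-in clamp; the repository's actual `process_xstart()` may do more than this, and the sample values are invented.

```python
import torch

# Illustration of the two tensors asked about: the raw network prediction
# ("all_model_outputs", unbounded) versus the processed sample
# ("all_samples", limited to [-1, 1]).

def process_xstart(x: torch.Tensor) -> torch.Tensor:
    # stand-in for the repo's process_xstart(): clamp to [-1, 1]
    return x.clamp(-1.0, 1.0)

raw = torch.tensor([-17.0, -0.5, 0.3, 18.0])  # "all_model_outputs"-like values
sample = process_xstart(raw)                  # "all_samples"-like values

print(raw.min().item(), raw.max().item())        # -17.0 18.0
print(sample.min().item(), sample.max().item())  # -1.0 1.0
```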

Hope you can help me, please.

jessie-chen99 avatar May 05 '23 04:05 jessie-chen99

Uncertainty cannot be calculated from `["all_samples"]`, because its values have already been limited to [-1, 1].
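A hedged sketch of why clipping matters: once predictions are clamped to [-1, 1], very different raw outputs collapse to (nearly) the same value, so a variance-style uncertainty computed on the clamped tensors is deflated. The numbers and the variance measure below are illustrative only, not the repository's actual uncertainty computation.

```python
import torch

# Two hypothetical runs disagreeing strongly at three voxels.
raw_runs = torch.tensor([[ 5.0, 12.0, 18.0],   # run 1: raw model outputs
                         [-9.0,  0.2, 17.0]])  # run 2: strong disagreement

clamped_runs = raw_runs.clamp(-1.0, 1.0)       # what "all_samples" would hold

raw_var = raw_runs.var(dim=0)          # per-voxel disagreement, clearly visible
clamped_var = clamped_runs.var(dim=0)  # mostly erased after clamping

print(raw_var)      # large variances survive in the raw outputs
print(clamped_var)  # clamping hides most of the disagreement
```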

920232796 avatar May 05 '23 07:05 920232796

> Uncertainty cannot be calculated from `["all_samples"]`, because its values have already been limited to [-1, 1].

Thank you for your explanation, I understand now!

As for my first question, yesterday I tested both your official code version (first row) and the paper version (second row; I only changed `return sample_return` to `return sample_return / uncer_step`).

By the way, the training settings are the same:

  • dataset: BraTS2020 (5-fold)
  • input_size == 96
  • batch_size per GPU == 2
  • num_of_gpu == 4
  • training_epochs == 300

The Dice score in the second row looks better, so this change may be helpful. Perhaps you could consider making it as well, but it's all up to you.
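The one-line change being compared can be sketched as a toy reduction. This is hypothetical code, not the repository's `test.py`; dividing by `uncer_step` turns the accumulated weighted sum into a weighted average, which is what the paper's Y = ∑ w_{i} × \bar{p_i} describes when the \bar{p_i} are per-run averages.

```python
import torch

def fuse(samples, weights, normalize):
    # accumulate the weighted per-run predictions
    sample_return = torch.zeros_like(samples[0])
    for w, p in zip(weights, samples):
        sample_return += w * p
    if normalize:
        # paper version: return sample_return / uncer_step
        return sample_return / len(samples)
    # original code version: return sample_return
    return sample_return

samples = [torch.ones(2, 2) * v for v in (0.2, 0.4, 0.6, 0.8)]
weights = [torch.ones(2, 2) for _ in samples]

code_version = fuse(samples, weights, normalize=False)
paper_version = fuse(samples, weights, normalize=True)
assert torch.allclose(paper_version, code_version / len(samples))
```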

[image]

If I made any mistake in the above analysis, please let me know. Thanks again!

jessie-chen99 avatar May 06 '23 02:05 jessie-chen99

Wow, thank you. I will modify this section.

920232796 avatar May 06 '23 02:05 920232796

> Uncertainty cannot be calculated from `["all_samples"]`, because its values have already been limited to [-1, 1].
>
> Thank you for your explanation, I understand now!
>
> As for my first question, yesterday I tested both your official code version (first row) and the paper version (second row; I only changed `return sample_return` to `return sample_return / uncer_step`).
>
> By the way, the training settings are the same:
>
> • dataset: BraTS2020 (5-fold)
> • input_size == 96
> • batch_size per GPU == 2
> • num_of_gpu == 4
> • training_epochs == 300
>
> The Dice score in the second row looks better, so this change may be helpful. Perhaps you could consider making it as well, but it's all up to you.
>
> [image]
>
> If I made any mistake in the above analysis, please let me know. Thanks again!

Hi Jessie, thanks for your valuable comments. I also ran the training code on the BraTS 2020 dataset, but my training results are quite strange:

  • wt: 0.8498128652572632
  • tc: 0.4872548282146454
  • et: 0.41504526138305664
  • mean_dice: 0.5840376615524292

This is the final (validation) result shown in the log after 300 epochs, and my settings are:

  • env = "DDP"
  • max_epoch = 300
  • batch_size = 2
  • num_gpus = 4
  • GPU type: A100

I think my settings are similar to yours, since I did not change anything and kept the defaults. Thus I suspect the reason could be different package versions. Could you please share your package versions here, if you don't mind? Here are mine:

  • Python 3.8.10
  • monai 1.1.0
  • numpy 1.22.2
  • SimpleITK 2.2.1
  • torch 1.13.0a0+936e930

My results are also similar to https://github.com/ge-xing/Diff-UNet/issues/18#issue-1739893968
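For comparing environments, the versions in question can be captured with a small helper. This is not from the repository; it only assumes the package names listed above and skips anything not installed.

```python
import platform
from importlib import metadata

# Print the interpreter version and the versions of the packages compared
# in this thread, skipping any that are missing from the environment.
print("Python", platform.python_version())
for pkg in ("monai", "numpy", "SimpleITK", "torch"):
    try:
        print(pkg, metadata.version(pkg))
    except metadata.PackageNotFoundError:
        print(pkg, "not installed")
```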

gary-wang55 avatar Jun 19 '23 04:06 gary-wang55