
Bug in seg_output

Open FabianIsensee opened this issue 8 years ago • 3 comments

After training, the script prints some segmentation results. These are mostly empty and suggest that the network did not train properly (which is not the case). This is due to a bug in the seg_output definition, which I fixed in this pull request.

Cheers, Fabian

FabianIsensee avatar Mar 17 '17 17:03 FabianIsensee

Thank you for the PR! There are two issues with this:

  1. Your PR contains a lot of unrelated commits. Can you rebase it onto the latest master? There should only be a single commit with your change. (Since you didn't create a branch for your PR, it should go something like `git fetch upstream`, `git rebase upstream/master`, `git push --force`. Or maybe you can use `git pull --rebase upstream` instead of the first two commands. If you get lost, let me know.)
  2. When you pass `batch_norm_update_averages=False` for training, then you need to pass `batch_norm_use_averages=False` for testing. But why do you disable updating the running averages in the first place? This means that at test time, it will batch-normalize using the examples in the test batch, instead of using the exponential running averages computed during training.
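The distinction behind point 2 can be illustrated with a minimal NumPy sketch. This is a conceptual stand-in, not Lasagne's actual implementation; the `batch_norm` function and its `use_averages` flag are named here for illustration only (learned scale/shift are omitted):

```python
import numpy as np

def batch_norm(x, running_mean, running_var, use_averages, eps=1e-5):
    """Conceptual batch normalization without learned scale/shift.

    use_averages=True  -> normalize with running statistics (deterministic,
                          a fixed per-example transformation)
    use_averages=False -> normalize with the current batch's own statistics
                          (each output depends on the rest of the batch)
    """
    if use_averages:
        mean, var = running_mean, running_var
    else:
        mean, var = x.mean(axis=0), x.var(axis=0)
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
# pretend these running averages were accumulated during training
running_mean, running_var = np.array([5.0]), np.array([4.0])

test_batch = rng.normal(loc=5.0, scale=2.0, size=(8, 1))

det = batch_norm(test_batch, running_mean, running_var, use_averages=True)
sto = batch_norm(test_batch, running_mean, running_var, use_averages=False)

# with batch statistics, the batch is exactly re-centered every time,
# so an example's output changes depending on which batch it lands in
assert abs(sto.mean()) < 1e-6
# with running averages, each example is transformed independently:
# processing one example alone gives the same result as in a full batch
assert np.allclose(
    batch_norm(test_batch[:1], running_mean, running_var, use_averages=True),
    det[:1],
)
```

If the running averages are never updated during training (as in the PR), they stay at their initial values and are useless at test time, which is why the two flags have to be disabled together.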

f0k avatar Mar 17 '17 18:03 f0k

Hi Jan,

  1. Sure, I will do that. Sorry for the mess.
  2. I disabled it after studying the code of FCDenseNet (https://github.com/SimJeg/FC-DenseNet), where disabling the running-average update increased segmentation quality because it avoids the stochasticity that arises from small batch sizes (they train with batch size 3, then fine-tune with batch size 1). I didn't really verify whether the segmentation improves in this example as well. If you wish, I can remove it, but we might need to update the pretrained parameters in that case.

Cheers, Fabian

FabianIsensee avatar Mar 28 '17 12:03 FabianIsensee

> then fine tune with batch size 1

If they fine-tune with batch size 1, then the network will behave well (and deterministically) when testing with batch size 1. If you train or test with batch sizes larger than 1, we should probably include the running averages, or use layer normalization instead of batch normalization (if you fear that a batch size of 8 is also too small). If we want to avoid retraining and updating the parameters, we should add a clear comment on what's done here and what should probably be done instead.
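Layer normalization, mentioned above as a batch-size-insensitive alternative, normalizes each example over its own features rather than over the batch. A minimal NumPy sketch (again a conceptual illustration, not a Lasagne API) shows that its output for one example is independent of the rest of the batch:

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # normalize each example over its own features;
    # no batch statistics are involved at all
    mean = x.mean(axis=1, keepdims=True)
    var = x.var(axis=1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

x = np.random.default_rng(1).normal(size=(4, 16))

# example 0 is normalized identically whether it is processed
# alone or together with the rest of the batch
alone = layer_norm(x[:1])
together = layer_norm(x)[:1]
assert np.allclose(alone, together)
```

This is why it sidesteps the small-batch stochasticity problem entirely: there is no train/test mismatch and no running average to maintain.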

f0k avatar Apr 04 '17 18:04 f0k