Bug in seg_output
After training, the script prints some segmentation results. These are mostly empty and suggest that the network did not train properly (which is not the case). This is due to a bug in the `seg_output` definition, which I fixed in this pull request. Cheers, Fabian
Thank you for the PR! There are two issues with this:
- Your PR contains a lot of unrelated commits. Can you rebase it onto the latest master? There should only be a single commit with your change. (Since you didn't create a branch for your PR, it should go something like `git fetch upstream`, `git rebase upstream/master`, `git push --force`. Or maybe you can use `git pull --rebase upstream` instead of the first two commands. If you get lost, let me know.)
- When you pass `batch_norm_update_averages=False` for training, then you need to pass `batch_norm_use_averages=False` for testing (as sketched below). But why do you disable updating the running averages in the first place? This means that at test time, it will batch-normalize using the examples in the test batch, instead of using the exponential running averages computed during training.
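For concreteness, here is a minimal sketch of how the two flags pair up in Lasagne (`network` stands for the model's output layer; the variable name is just for illustration):

```python
import lasagne

# Training pass: normalize with batch statistics and skip updating
# the exponential running averages (as the script currently does).
train_out = lasagne.layers.get_output(
    network, deterministic=False,
    batch_norm_update_averages=False)

# Matching test pass: the running averages were never updated, so we
# must fall back to batch statistics here as well.
test_out = lasagne.layers.get_output(
    network, deterministic=True,
    batch_norm_use_averages=False)
```

Without `batch_norm_use_averages=False`, the deterministic pass would normalize with the never-updated running averages instead.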
Hi Jan,
- Sure, I will do that. Sorry for the mess.
- I disabled it after studying the code of FC-DenseNet (https://github.com/SimJeg/FC-DenseNet), where disabling the running-average update increased segmentation quality by avoiding the stochasticity that arises from small batch sizes (they train with batch size 3, then fine-tune with batch size 1). I didn't verify whether the segmentation improves in this example as well. If you wish, I can remove it, but we might need to update the pretrained parameters in that case. Cheers, Fabian
> then fine-tune with batch size 1
If they fine-tune with batch size 1, then the network will behave well (and deterministically) when testing with batch size 1. If you train or test with batch sizes larger than 1, we should probably include the running averages, or use layer normalization instead of batch normalization (if you fear that a batch size of 8 is also too small). If we want to avoid retraining and updating the parameters, we should add a clear comment on what's done here and what should probably be done instead.
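If we go the layer-normalization route, here is a minimal sketch of the idea in Theano (the `gamma`/`beta` learned scale and shift parameters and the function itself are illustrative, not part of the existing code):

```python
import theano.tensor as T

def layer_norm(x, gamma, beta, eps=1e-5):
    # Normalize each sample over its own feature/spatial axes (every
    # axis except the batch axis), so the statistics are the same for
    # batch size 1 and batch size 8 and no running averages are needed.
    axes = tuple(range(1, x.ndim))
    mean = x.mean(axis=axes, keepdims=True)
    std = T.sqrt(x.var(axis=axes, keepdims=True) + eps)
    return gamma * (x - mean) / std + beta
```

Since no batch-level statistics are involved, the train/test mismatch disappears entirely.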