pix2pixHD
Batchnorm is not in eval mode
When using batchnorm, in pix2pixHD_model.py, around line 216 in the inference function, I think we should use eval mode. But right now batchnorm runs in train mode.
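To illustrate what I mean, here is a minimal sketch of the inference call I would expect; the toy generator below is just a stand-in, not the actual netG from pix2pixHD_model.py:

```python
import torch
import torch.nn as nn

# Toy stand-in for the generator, only to illustrate the point; the real
# network in pix2pixHD_model.py is of course different.
netG = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.BatchNorm2d(8), nn.ReLU())

netG.eval()                        # what I would expect: BatchNorm uses its running statistics
with torch.no_grad():              # no gradients needed at inference
    out = netG(torch.randn(1, 3, 32, 32))
print(out.shape)                   # torch.Size([1, 8, 32, 32])
```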
Does anyone know why they don't use .eval()? Is it a bug, or is it by design?
I think it is a bug.
At inference, if batchnorm is used in train mode with batch size 1, its behavior is the same as instancenorm. So batchnorm is effectively always used as instancenorm in the current code (see the quick check after this comment).
In a fully trained model, the running mean and variance saved by batchnorm are nearly equal to the per-instance mean and variance that instancenorm computes on the fly. In that case it doesn't matter much whether batchnorm is run in train mode or eval() mode. That's maybe why they haven't noticed the bug yet.
But in a model that is not fully trained, the results would be significantly different.
[just personal opinion.]
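A quick check of the claim above (standalone PyTorch snippet, not code from this repo): with batch size 1, a BatchNorm2d layer in train mode produces the same output as InstanceNorm2d, up to numerical noise.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(1, 8, 16, 16)                 # batch size 1, as at inference

bn = nn.BatchNorm2d(8, affine=False).train()  # train mode: normalizes with batch statistics
inorm = nn.InstanceNorm2d(8, affine=False)    # normalizes each sample per channel

# With a single sample, the batch statistics ARE the per-instance statistics,
# so the two layers agree up to numerical precision.
print(torch.allclose(bn(x), inorm(x), atol=1e-5))   # True
```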
It's not a bug. In the original pix2pix paper, the authors state the following:
At inference time, we run the generator net in exactly the same manner as during the training phase. This differs from the usual protocol in that we apply dropout at test time, and we apply batch normalization [29] using the statistics of the test batch, rather than aggregated statistics of the training batch. This approach to batch normalization, when the batch size is set to 1, has been termed “instance normalization” and has been demonstrated to be effective at image generation tasks [54].
The same applies to pix2pixHD.
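So at test time the generator is deliberately kept with its norm layers in training behavior, roughly like this (toy module for illustration, not the actual pix2pixHD inference code):

```python
import torch
import torch.nn as nn

# Toy generator standing in for netG, just to sketch the protocol described
# in the paper quote above: BatchNorm keeps using the statistics of the test
# batch because the module is never switched to eval mode.
netG = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    nn.BatchNorm2d(16),
    nn.ReLU(),
    nn.Conv2d(16, 3, 3, padding=1),
)

netG.train()                       # deliberately NOT .eval()
with torch.no_grad():              # gradients are still disabled at test time
    fake = netG(torch.randn(1, 3, 64, 64))
print(fake.shape)                  # torch.Size([1, 3, 64, 64])
```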
Same opinion: the behavior is the same as instancenorm. 👍
👍