
Batchnorm is not in eval mode

Open Feng137 opened this issue 5 years ago • 5 comments

When using batchnorm in pix2pixHD_model.py, at line 216 in the inference function, I think we should use eval mode. At the moment, batchnorm runs in train mode during inference.
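For reference, a minimal sketch of what the suggested change would look like (a hypothetical wrapper, not the repo's actual code, assuming the model exposes `netG` and an `inference(...)` entry point as in the repo's test script):

```python
import torch

def run_inference(model, label, inst):
    # Hypothetical sketch: put the generator in eval mode so BatchNorm uses its
    # running statistics and Dropout is disabled, then run the forward pass.
    model.netG.eval()
    with torch.no_grad():          # no gradients needed at test time
        fake_image = model.inference(label, inst)
    return fake_image
```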

Feng137 avatar Jan 16 '20 03:01 Feng137

does anyone know why they don't use .eval()? is it a bug or is it by design?

ashual avatar Jul 13 '20 17:07 ashual

> does anyone know why they don't use .eval()? is it a bug or is it by design?

I think it is a bug.

During inference, if batchnorm runs in train mode, its behavior is the same as instance norm. So in the current code, batchnorm always acts as instance norm.

In a fully trained model, the running mean and variance saved by batchnorm are nearly equal to the per-sample mean and variance that instance norm computes on the fly. In that case, whether batchnorm is in train mode or eval() mode makes little difference. That is probably why the bug has not been noticed yet.

But in a model that is not fully trained, the results would differ significantly.

[just personal opinion.]
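A quick way to see the equivalence claimed above (a standalone check I wrote for illustration, not code from the repo): with batch size 1, the per-batch statistics that BatchNorm computes in train mode are exactly the per-sample statistics that InstanceNorm computes.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(1, 3, 8, 8)                  # batch size 1, as in typical inference

bn = nn.BatchNorm2d(3, affine=False)          # affine=False: pure normalization only
inorm = nn.InstanceNorm2d(3, affine=False)

bn.train()                                    # train mode: use current-batch statistics
print(torch.allclose(bn(x), inorm(x), atol=1e-5))   # True: identical normalization
```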

Feng137 avatar Jul 14 '20 02:07 Feng137

It's not a bug. In the original pix2pix paper the authors state the following:

> At inference time, we run the generator net in exactly the same manner as during the training phase. This differs from the usual protocol in that we apply dropout at test time, and we apply batch normalization [29] using the statistics of the test batch, rather than aggregated statistics of the training batch. This approach to batch normalization, when the batch size is set to 1, has been termed “instance normalization” and has been demonstrated to be effective at image generation tasks [54].

Same applies for pix2pixHD.
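To illustrate the difference between the two conventions (again a hypothetical snippet, not from the repo): in eval mode BatchNorm normalizes with the aggregated running statistics, while in train mode it normalizes with the statistics of the test batch, and the two only agree once the running statistics have converged.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm2d(3)

# Simulate a partially trained layer: the running statistics have started to
# move but are still far from the data statistics.
for _ in range(5):
    bn(torch.randn(4, 3, 16, 16) * 2.0 + 1.0)

x = torch.randn(1, 3, 16, 16) * 2.0 + 1.0    # a single test image

bn.eval()
out_eval = bn(x)       # aggregated (running) statistics from the training batches
bn.train()
out_train = bn(x)      # statistics of the test batch itself (instance norm when N=1)

print((out_train - out_eval).abs().max())    # noticeably > 0 until running stats converge
```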

kondela avatar Aug 11 '20 14:08 kondela

> I think it is a bug. [...] During inference, if batchnorm runs in train mode, its behavior is the same as instance norm. So in the current code, batchnorm always acts as instance norm.

Same opinion: the behavior is the same as instance norm. 👍

Feng137 avatar Dec 21 '20 03:12 Feng137

> It's not a bug. In the original pix2pix paper the authors state the following: [...] Same applies for pix2pixHD.

👍

Feng137 avatar Dec 21 '20 03:12 Feng137