
How to handle BatchNorm?

Open heleifz opened this issue 3 years ago • 1 comment

BatchNorm is very common in CV models. When training=True, each chunk is normalized with its own batch statistics and the running statistics in the BatchNorm layers are updated on every chunk, so the chunked forward passes no longer match a single full-batch forward.
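A minimal sketch of the mismatch (my own illustration, using torch.nn.BatchNorm1d and arbitrary sizes, not code from GradCache):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(32, 8)  # full batch of 32 examples, 8 features

# Full-batch forward: one set of batch statistics, one running-stats update.
bn_full = nn.BatchNorm1d(8)
bn_full.train()
out_full = bn_full(x)

# Chunked forward (as gradient caching would do): 4 chunks of 8 examples each.
# Each chunk is normalized with its own statistics, and the running stats are
# updated once per chunk.
bn_chunked = nn.BatchNorm1d(8)
bn_chunked.train()
out_chunked = torch.cat([bn_chunked(chunk) for chunk in x.chunk(4)])

print(torch.allclose(out_full, out_chunked))        # False: outputs differ
print(torch.allclose(bn_full.running_mean,
                     bn_chunked.running_mean))      # False: running stats drift
```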

heleifz avatar Sep 30 '22 07:09 heleifz

There is no real way around this unless you replace BN with GN. I have thought a little about the problem. The inconsistency mainly appears in the forward pass: in each sub-iteration we cannot get the batch statistics right, because we don't yet have the other chunks' statistics, so we cannot estimate the statistics of the full batch. One option is to cache the outputs after the BN layers and synchronize across chunks, like SyncBN does, but that carries a big cost. If you have time, I would be pleased to exchange views. A rough sketch of the GN replacement is below.
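If it helps, here is a rough sketch (mine, not from GradCache and untested on any specific model; the default num_groups is just a common choice to tune) of swapping nn.BatchNorm2d for nn.GroupNorm, which uses no batch statistics and therefore behaves the same whether or not the batch is chunked:

```python
import math
import torch.nn as nn

def replace_bn_with_gn(module: nn.Module, num_groups: int = 32) -> None:
    """Recursively swap every nn.BatchNorm2d in `module` for an nn.GroupNorm."""
    for name, child in module.named_children():
        if isinstance(child, nn.BatchNorm2d):
            # GroupNorm requires num_groups to divide the channel count;
            # gcd guarantees a valid group count no larger than the request.
            groups = math.gcd(num_groups, child.num_features)
            setattr(module, name,
                    nn.GroupNorm(groups, child.num_features, affine=child.affine))
        else:
            replace_bn_with_gn(child, num_groups)
```

Note that the GroupNorm affine parameters are freshly initialized, so this is a starting point for training or fine-tuning rather than a drop-in replacement for pretrained BN statistics.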

zzk2021 avatar Jan 26 '23 07:01 zzk2021