What's the difference between GBN and the BN used in distributed frameworks?
I've read your paper, but I don't understand the difference between GBN and the BN used in distributed frameworks. In my understanding, GBN does BN with local data, but distributed frameworks also only do BN with local data. So can you explain the difference?
From what I understood in the paper, they are the same thing. In GBN, you artificially "isolate" parts of the batch when computing the values as if they were on distributed machines, even if you are training on a single system.
@Moxinilian you're right. If you're interested in a more efficient implementation, you could check TF's BatchNormalization layer with the virtual_batch_size param. It reshapes the input and batch-norms it inside the BN layer, instead of making a separate pass for each virtual mini-batch.
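To make the reshape trick concrete, here is a minimal NumPy sketch (not the TF implementation, just an illustration of the idea): the batch is reshaped into virtual sub-batches, and each sub-batch is normalized with its own statistics, exactly as if it lived on a separate machine. The function names and the 2-D input shape are my own assumptions for illustration.

```python
import numpy as np

def ghost_batch_norm(x, virtual_batch_size, eps=1e-5):
    """Normalize each virtual sub-batch with its own mean/variance (GBN idea).

    x: array of shape (N, D); N must be divisible by virtual_batch_size.
    """
    n, d = x.shape
    assert n % virtual_batch_size == 0
    # Reshape to (num_virtual_batches, virtual_batch_size, D) so that each
    # virtual batch gets its own statistics in a single vectorized pass,
    # instead of looping over sub-batches.
    chunks = x.reshape(n // virtual_batch_size, virtual_batch_size, d)
    mean = chunks.mean(axis=1, keepdims=True)
    var = chunks.var(axis=1, keepdims=True)
    out = (chunks - mean) / np.sqrt(var + eps)
    return out.reshape(n, d)

def batch_norm(x, eps=1e-5):
    """Plain BN over the whole batch, for comparison."""
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)
```

With `virtual_batch_size` equal to the full batch size, `ghost_batch_norm` reduces to plain `batch_norm`; with a smaller value, each sub-batch is normalized in isolation, which is the "artificial distribution" described above.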