
BN inference

Open ChenglongChen opened this issue 9 years ago • 1 comment

Nice work! I am also interested in the BN inference part. To my understanding of the paper, Algorithm 2 should work as follows (https://github.com/BVLC/caffe/pull/1965#issuecomment-82956030):

  1. Before the TEST phase, we forward a few mini-batches to compute the mean & var for the 1st BN layer, then save this mean & var for later rounds of inference (& forward passes).
  2. We then forward those same mini-batches to compute the mean & var for the 2nd BN layer; note that the normalization in the 1st BN layer is now carried out using the mean & var computed in step 1, not the mini-batch statistics.
  3. Similarly, we repeat this for the remaining BN layers.
  4. After computing all the means & vars, we have the inference BN network.

So I think in the following lines you should swap the for-loop order (for k; then for iter;) and use the population means & vars of BN layers 1 through k-1 when computing the k-th BN layer's mean & var. https://github.com/lim0606/ndsb/blob/master/codes_for_caffe/predict_bn.cpp#L295-L302
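The layer-wise scheme in steps 1-4 above can be sketched as follows. This is a minimal NumPy illustration with a toy stack of linear + BN layers, not the actual Caffe code; the function name, the network shape, and the epsilon value are all hypothetical.

```python
import numpy as np

def estimate_bn_stats_layerwise(batches, weights, eps=1e-5):
    """Estimate population mean & var for each BN layer, one layer at a time.

    batches: list of 2-D arrays (the held-out mini-batches)
    weights: list of weight matrices; each linear layer is assumed to be
             followed by a BN layer (a toy stand-in for the real network)
    Returns a list of (mean, var) pairs, one per BN layer.
    """
    stats = []
    for k, W in enumerate(weights):          # outer loop: over BN layers (for k)
        acts = []
        for x in batches:                    # inner loop: over mini-batches (for iter)
            h = x
            # forward through earlier layers using their already-frozen stats,
            # never the current mini-batch's own statistics
            for j in range(k):
                a = h @ weights[j]
                mu, var = stats[j]
                h = (a - mu) / np.sqrt(var + eps)
            acts.append(h @ W)               # pre-BN activations of layer k
        a_all = np.concatenate(acts, axis=0)
        # freeze layer k's population statistics before moving to layer k+1
        stats.append((a_all.mean(axis=0), a_all.var(axis=0)))
    return stats
```

The key point is the loop nesting: the layer index is the outer loop, so layer k's statistics are only computed after layers 1..k-1 have been frozen.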

What's your opinion?

ChenglongChen avatar Mar 18 '15 16:03 ChenglongChen

Thank you for sharing your bn_layer code and comments :)

I also agree with your opinion that estimating the means and variances layer by layer, from the bottom layer up, is the more natural way (similar to the greedy layer-wise pre-training of a DBN).

When I implemented predict_bn.cpp, however, I thought the means and variances would converge to similar values if I estimated them with a large enough number of training examples. That seems to be why the original paper only briefly describes the estimation of means and variances for inference.
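For comparison, the single-pass alternative being discussed, where the loop over mini-batches is the outer loop and every layer's statistics are accumulated in the same sweep, can be sketched like this. Again a hypothetical NumPy toy, not the repository's Caffe code; earlier layers here are normalized with each mini-batch's own statistics rather than frozen population statistics.

```python
import numpy as np

def estimate_bn_stats_single_pass(batches, weights, eps=1e-5):
    """Accumulate every BN layer's mean & var in one sweep over the data.

    Loop order is 'for iter { for k }': during the sweep, earlier layers
    normalize with their mini-batch statistics, so deeper layers see
    slightly different activations than in the layer-wise scheme.
    """
    acts = [[] for _ in weights]
    for x in batches:                        # outer loop: over mini-batches
        h = x
        for k, W in enumerate(weights):      # inner loop: over layers
            a = h @ W
            acts[k].append(a)                # record pre-BN activations
            mu_b = a.mean(axis=0)            # mini-batch statistics,
            var_b = a.var(axis=0)            # not frozen population stats
            h = (a - mu_b) / np.sqrt(var_b + eps)
    return [(np.concatenate(a, axis=0).mean(axis=0),
             np.concatenate(a, axis=0).var(axis=0)) for a in acts]
```

Note that the first BN layer's estimate is identical under both schemes (its pre-BN activations do not depend on any normalization); the two schemes can only diverge from the second BN layer onward, and the divergence shrinks as mini-batch statistics approach the population statistics.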

I will compare the results of the two inference schemes! :)

lim0606 avatar Mar 18 '15 23:03 lim0606