BN inference
Nice work! I am also interested in the BN inference part. To my understanding of the paper, Algorithm 2 should be something like the following (https://github.com/BVLC/caffe/pull/1965#issuecomment-82956030):
- before the TEST phase, we forward a few mini-batches to compute the mean & var for the 1st BN layer, then save this mean & var for all later inference (& forward passes)
- we then forward those mini-batches to compute the mean & var for the 2nd BN layer; note that the normalization in the 1st BN layer is now carried out using the mean & var computed in step 1, not the mini-batch statistics
- similarly, we do the same for the remaining BN layers
- after computing all the means & vars, we then have the inference BN network.
So I think in the following lines you should switch the for-loop order (for k; then for iter;) and use the (k-1)-th BN's population mean & var when computing the k-th BN's mean & var: https://github.com/lim0606/ndsb/blob/master/codes_for_caffe/predict_bn.cpp#L295-L302
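Here is a minimal toy sketch of that loop order, assuming a network of identity-plus-BN layers with made-up data; it is only an illustration of the steps above, not the actual predict_bn.cpp code:

```cpp
// Toy, self-contained illustration -- not the actual predict_bn.cpp code.
// Each "layer" is identity followed by BN, so we can focus purely on the
// loop order and on how the statistics are accumulated.
#include <cmath>
#include <cstdio>
#include <vector>

struct BNStats { double mean = 0.0, var = 1.0; };

// Normalize one mini-batch of activations with fixed (population) stats.
void bn_forward_fixed(std::vector<double>& x, const BNStats& s) {
  const double eps = 1e-5;
  for (double& v : x) v = (v - s.mean) / std::sqrt(s.var + eps);
}

// Mean and (biased) variance of the current mini-batch.
BNStats batch_stats(const std::vector<double>& x) {
  BNStats s;
  s.mean = 0.0;
  for (double v : x) s.mean += v;
  s.mean /= x.size();
  s.var = 0.0;
  for (double v : x) s.var += (v - s.mean) * (v - s.mean);
  s.var /= x.size();
  return s;
}

int main() {
  const int num_layers = 3, num_iters = 4, batch = 8;
  std::vector<std::vector<double>> batches(num_iters,
                                           std::vector<double>(batch));
  for (int i = 0; i < num_iters; ++i)        // arbitrary toy data
    for (int j = 0; j < batch; ++j)
      batches[i][j] = 0.1 * (i + 1) * (j - 4);

  std::vector<BNStats> population(num_layers);

  // Outer loop over BN layers (k), inner loop over mini-batches (iter):
  // while layer k is being estimated, layers 0..k-1 already normalize
  // with their frozen population statistics, as in the steps above.
  for (int k = 0; k < num_layers; ++k) {
    double mean_acc = 0.0, var_acc = 0.0;
    for (int iter = 0; iter < num_iters; ++iter) {
      std::vector<double> x = batches[iter];
      for (int j = 0; j < k; ++j)            // frozen layers below k
        bn_forward_fixed(x, population[j]);
      BNStats s = batch_stats(x);            // stats at layer k's input
      mean_acc += s.mean;
      var_acc += s.var;
    }
    population[k].mean = mean_acc / num_iters;
    // m/(m-1) unbiased correction, as in the paper's inference formula.
    population[k].var = (var_acc / num_iters) * batch / (batch - 1.0);
    std::printf("layer %d: mean=%+.4f var=%.4f\n", k, population[k].mean,
                population[k].var);
  }
  return 0;
}
```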
What's your opinion?
Thank you for sharing your bn_layer code and comments :)
I also agree that estimating the means and variances layer by layer from the bottom layer is the more natural way (similar to greedy layer-wise pre-training of a DBN).
When I implemented predict_bn.cpp, however, I thought the means and variances would converge to similar values if I estimated them with a large enough number of training examples. That seems to be why the original paper only briefly describes the estimation of the means and variances for inference.
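To make the comparison concrete, here is a toy sketch of the single-pass order I used (again not the actual predict_bn.cpp code); it assumes the BNStats, batch_stats, and bn_forward_fixed helpers from the sketch above:

```cpp
// Toy sketch of the single-pass order (for iter; then for k;), reusing the
// BNStats, batch_stats, and bn_forward_fixed helpers from the sketch above.
// Every BN layer normalizes with its own current mini-batch statistics
// while running averages are accumulated for all layers at once.
std::vector<BNStats> estimate_single_pass(
    const std::vector<std::vector<double>>& batches, int num_layers) {
  const int num_iters = static_cast<int>(batches.size());
  const int batch = static_cast<int>(batches[0].size());
  std::vector<double> mean_acc(num_layers, 0.0), var_acc(num_layers, 0.0);

  for (int iter = 0; iter < num_iters; ++iter) {
    std::vector<double> x = batches[iter];
    for (int k = 0; k < num_layers; ++k) {
      BNStats s = batch_stats(x);   // this mini-batch's own statistics
      mean_acc[k] += s.mean;
      var_acc[k] += s.var;
      bn_forward_fixed(x, s);       // training-mode normalization
    }
  }
  std::vector<BNStats> population(num_layers);
  for (int k = 0; k < num_layers; ++k) {
    population[k].mean = mean_acc[k] / num_iters;
    population[k].var = (var_acc[k] / num_iters) * batch / (batch - 1.0);
  }
  return population;
}
```

With many mini-batches, the upstream normalizations seen by layer k in the two schemes should become similar, which is why I expected the estimates to converge.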
I will compare the results of the two inference schemes! :)