open-reid
Why do you initialize the weights of the fully connected layer from a Gaussian distribution with std 0.001?
Dear author: I think your work is wonderful, and this repository really helps a lot with person re-ID! I have one question, though. Why do you initialize the weights of the fully connected layer from a Gaussian distribution with std 0.001? I ran a comparison between your version with the 0.001 std Gaussian initialization and a version using PyTorch's default initialization for the fully connected layer, and it is amazing that such a simple trick gives an improvement of almost 5 percent in mAP. Is there a theoretical justification for this trick, or was it found empirically? I'm really interested in your work and look forward to your reply!
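To make the difference between the two setups concrete, here is a minimal sketch of the weight scales involved. It uses only the Python standard library and assumes that PyTorch's default `nn.Linear` weight initialization is equivalent to a uniform draw from (-1/sqrt(fan_in), 1/sqrt(fan_in)). The layer sizes `fan_in` and `fan_out` and both helper functions are hypothetical placeholders, not values or code from this repository.

```python
import math
import random
import statistics


def default_linear_weights(fan_in, fan_out, rng):
    # Assumption: mimics PyTorch's default nn.Linear weight init,
    # i.e. uniform in (-1/sqrt(fan_in), 1/sqrt(fan_in)).
    bound = 1.0 / math.sqrt(fan_in)
    return [rng.uniform(-bound, bound) for _ in range(fan_in * fan_out)]


def gaussian_weights(fan_in, fan_out, rng, std=0.001):
    # Zero-mean Gaussian with a small std. In PyTorch this would be
    # torch.nn.init.normal_(fc.weight, std=0.001).
    return [rng.gauss(0.0, std) for _ in range(fan_in * fan_out)]


rng = random.Random(0)
fan_in, fan_out = 2048, 64  # hypothetical sizes, chosen for illustration only

default_std = statistics.pstdev(default_linear_weights(fan_in, fan_out, rng))
small_std = statistics.pstdev(gaussian_weights(fan_in, fan_out, rng))
ratio = default_std / small_std

print(f"default init std:  {default_std:.5f}")
print(f"gaussian init std: {small_std:.5f}")
print(f"ratio:             {ratio:.1f}")
```

Under these assumptions the default weights have a standard deviation of 1/sqrt(3 * fan_in), about 0.0128 for `fan_in = 2048`, which is roughly an order of magnitude larger than 0.001. So the two versions start with very different initial output magnitudes from this layer, which may be related to the gap I observed.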