Non-local_pytorch
Problem with the initialization of W
I think you want to initialize self.W to zero so that the residual path won't affect the pre-trained model, but I cannot figure out why you initialize self.W[1] rather than self.W[0] when using a BN layer.
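For context, here is a minimal sketch of the pattern the question refers to (my own illustration, not the repository's exact code): when W is a Sequential of a conv followed by BatchNorm, W[0] is the conv and W[1] is the BN, and zeroing the BN's affine parameters makes the residual branch output zeros so the block starts as the identity.

```python
import torch
import torch.nn as nn

# Hypothetical sketch: W as [conv, bn], mirroring the non-local block's
# output transform. Zeroing W[1] (the BN's gamma and beta) forces the
# whole branch to output 0, so z = W(y) + x reduces to x at init.
in_channels = 4
W = nn.Sequential(
    nn.Conv2d(in_channels, in_channels, kernel_size=1),
    nn.BatchNorm2d(in_channels),
)
nn.init.constant_(W[1].weight, 0)  # BN scale (gamma) -> 0
nn.init.constant_(W[1].bias, 0)    # BN shift (beta)  -> 0

W.eval()  # use running stats so BN is deterministic here
x = torch.randn(2, in_channels, 8, 8)
y = torch.randn(2, in_channels, 8, 8)
z = W(y) + x  # residual connection of the non-local block
print(torch.allclose(z, x))  # True: the block is the identity at init
```

Note that zeroing W[0] (the conv) instead would not achieve this once BN's beta is learned or nonzero, since BN can still shift a zero input; zeroing the last BN is the standard way to make a residual branch start as identity.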
Hi @djy-tsinghua, maybe you can find the answer in this issue https://github.com/AlexHex7/Non-local_pytorch/issues/1.
Thank you very much, @AlexHex7!