visual-concepts
Why use -6.58 as the initialization of the bias?
https://github.com/s-gupta/visual-concepts/blob/0e223639df399dc973f8c7005ee68c228afc9784/output/v1/mil_finetune.prototxt#L184
Hi Saurabh, I have recently been porting this code from Caffe to PyTorch. While tuning the fully convolutional network, I found a very interesting trick in the code: the bias of the classification layer has to be initialized to -6.58, otherwise the optimization is misled. For example, if the bias is initialized to zero, the model does not converge at all. Could you explain why you chose this value for the initialization and how you found it?
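To make the question concrete, here is a minimal sketch of what a bias of -6.58 implies numerically. It is only one possible reading and is not confirmed by the author: it assumes the classification layer feeds a sigmoid and that the per-region probabilities are pooled with a noisy-OR, as in multiple-instance learning. The region count `NUM_REGIONS = 144` and the helper names are hypothetical, chosen purely for illustration.

```python
import math

# Hypothetical number of spatial regions per image (an assumption, not from the repo).
NUM_REGIONS = 144


def sigmoid(x: float) -> float:
    """Logistic function: maps a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def logit(p: float) -> float:
    """Inverse of the sigmoid: the bias that yields prior probability p."""
    return math.log(p / (1.0 - p))


def noisy_or(p: float, n: int) -> float:
    """Image-level probability when n regions each fire independently with probability p."""
    return 1.0 - (1.0 - p) ** n


# Per-region probability implied by each bias when the weights contribute nothing.
p_trick = sigmoid(-6.58)  # a small prior, on the order of 0.001
p_zero = sigmoid(0.0)     # exactly 0.5

# Image-level probability after noisy-OR pooling over all regions.
bag_trick = noisy_or(p_trick, NUM_REGIONS)
bag_zero = noisy_or(p_zero, NUM_REGIONS)

print(f"bias=-6.58: region p={p_trick:.5f}, image p={bag_trick:.3f}")
print(f"bias= 0.00: region p={p_zero:.5f}, image p={bag_zero:.3f}")
```

Under these assumptions, a zero bias pushes the pooled image-level probability to essentially 1 for every class before any training, which saturates the loss, while -6.58 starts each region at a small prior. In PyTorch the same initialization could be applied with `torch.nn.init.constant_(layer.bias, -6.58)`, where `layer` stands for the classification layer.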
Did you port this code to PyTorch?
@mememimis, I have ported it to PyTorch, but I still have some problems with it. I will release the code later, and I hope you can help refine it.
Could you share your PyTorch code? Thank you very much.