visual-concepts Why use -6.58 as the initialization of bias

Open Beanocean opened this issue 6 years ago • 3 comments

https://github.com/s-gupta/visual-concepts/blob/0e223639df399dc973f8c7005ee68c228afc9784/output/v1/mil_finetune.prototxt#L184

Hi Saurabh, I have recently been porting this code from Caffe to PyTorch. While tuning the fully convolutional network, I found a very interesting trick in the code: the bias of the classification layer is initialized to -6.58. Without this, the optimization goes wrong. For example, if the bias is initialized to zero, the model does not converge at all. Why did you choose this value for the initialization, and how did you find it?

Dec 14 '18 07:12 Beanocean
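
For anyone reading along, here is a rough sketch of what a bias of -6.58 means numerically. It is not the author's answer, and the thread does not say how the value was chosen. The sketch assumes the classification layer feeds a sigmoid, and it assumes the per-region probabilities are pooled with a noisy-OR (suggested by the "mil" in the prototxt name, but not stated in this thread). The helper names and the region counts are made up for illustration.

```python
import math

# Value from mil_finetune.prototxt, as quoted in this issue.
BIAS_INIT = -6.58


def sigmoid(x: float) -> float:
    """Logistic function: maps a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def bias_for_prior(p: float) -> float:
    """Inverse sigmoid: the bias that gives an initial output of p (weights near zero)."""
    return math.log(p / (1.0 - p))


def noisy_or(p_region: float, num_regions: int) -> float:
    """Image-level probability if every region outputs p_region (assumed pooling)."""
    return 1.0 - (1.0 - p_region) ** num_regions


# Per-region probability at initialization, assuming near-zero weights.
p0 = sigmoid(BIAS_INIT)
print(f"sigmoid({BIAS_INIT}) = {p0:.6f}")
print(f"bias recovered from that probability = {bias_for_prior(p0):.2f}")

# Hypothetical region counts, only to compare the two initializations.
for n in (1, 100, 1000):
    print(f"regions={n:5d}  bias=-6.58 -> {noisy_or(p0, n):.4f}   bias=0 -> {noisy_or(0.5, n):.4f}")
```

Under these assumptions, a zero bias makes every region start at probability 0.5, so the pooled probability is already saturated near 1 for any realistic number of regions, while -6.58 starts each region near 0.0014 and keeps the pooled value well away from saturation. That is one possible reading of why the zero initialization fails to converge. In a PyTorch port, the same value can be set with `torch.nn.init.constant_(layer.bias, -6.58)`, where `layer` stands for whatever module holds the classification layer.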

Did you port this code to PyTorch?

Mar 21 '19 16:03 mememimis

@mememimis, I have ported it to PyTorch, but I still have some problems with it. I will release the code later, and I hope you can help refine it.

Mar 25 '19 04:03 Beanocean

Can you share your PyTorch code? Thank you very much.

Mar 02 '20 09:03 lifeGWT
