a-PyTorch-Tutorial-to-Image-Captioning

init_embedding

Open zplovekq opened this issue 5 years ago • 1 comment

bias = np.sqrt(3.0 / embeddings.size(1))
torch.nn.init.uniform_(embeddings, -bias, bias)

The default PyTorch init is init.normal_(self.weight). Why do it this way instead, and what is the reference? Looking forward to the discussion.

zplovekq avatar Jan 10 '20 09:01 zplovekq

bias = np.sqrt(3.0 / embeddings.size(1))
torch.nn.init.uniform_(embeddings, -bias, bias)

This is the lecun_uniform way of initializing; here, fan_in is the embedding size emb_dim, which is obtained with embeddings.size(1) in the code.

The code samples (i.e. picks) values uniformly from the interval (-bias, +bias), where bias is defined in the code as sqrt(3.0 / emb_dim).
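For concreteness, here is a minimal, self-contained sketch of this scheme applied to a fresh nn.Embedding (the sizes are hypothetical), plus a check that the resulting weight variance matches the theoretical value: a Uniform(-a, a) variable has variance a^2 / 3, so with a = sqrt(3 / emb_dim) the weights end up with variance 1 / emb_dim.

import torch
import torch.nn as nn

vocab_size, emb_dim = 10000, 512  # hypothetical sizes for illustration
embedding = nn.Embedding(vocab_size, emb_dim)

# Same scheme as the snippet above: sample each weight uniformly
# from (-bias, +bias), with bias = sqrt(3 / fan_in) and fan_in = emb_dim.
bias = (3.0 / embedding.weight.size(1)) ** 0.5
torch.nn.init.uniform_(embedding.weight, -bias, bias)

# Empirical variance of the weights vs. the theoretical 1 / emb_dim.
print(embedding.weight.var().item())  # ~ 1 / 512 ≈ 0.00195
print(bias ** 2 / 3)                  # theoretical value: 1 / emb_dim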

The default PyTorch init is init.normal_(self.weight). Why do it this way instead, and what is the reference?

Well, there is a whole area of research on why some initializations work better than simply sampling values from a plain uniform or Gaussian distribution. Some initializations, such as lecun_uniform, are empirically found to be better.
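To see the practical difference at initialization, here is a small sketch (sizes again hypothetical) contrasting PyTorch's default init.normal_ for nn.Embedding with the lecun_uniform-style scaling: the default gives per-weight variance around 1, so each embedding vector has norm around sqrt(emb_dim), while the fan-in-scaled uniform keeps per-weight variance at 1 / emb_dim, so vector norms stay around 1 regardless of emb_dim.

import torch
import torch.nn as nn

emb_dim = 512  # hypothetical size

# PyTorch's default for nn.Embedding is init.normal_(self.weight),
# i.e. a standard normal: per-weight std ~ 1.
default = nn.Embedding(1000, emb_dim)
print(default.weight.std().item())   # ~ 1.0

# LeCun-uniform-style init keeps the per-weight variance at 1 / emb_dim.
lecun = nn.Embedding(1000, emb_dim)
bound = (3.0 / emb_dim) ** 0.5
torch.nn.init.uniform_(lecun.weight, -bound, bound)
print(lecun.weight.std().item())     # ~ 1 / sqrt(512) ≈ 0.044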

Here is one reference: initializers/lecun_uniform

kmario23 avatar Jan 10 '20 17:01 kmario23