dssm
Why does my training loss become NaN after 4 or 5 epochs? The softmax value is NaN.
I found that the function for calculating cosine_sim doesn't use exp (which is mentioned in the original paper).
exp is used in the softmax, not in the cosine similarity itself.
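For reference, here is a minimal NumPy sketch of how the two pieces fit together. It is not this repo's code: the names cosine_sim, stable_softmax, and nll_loss, the eps values, and gamma=10.0 are assumptions chosen for illustration. It guards against two common sources of NaN that may or may not be the cause here: dividing by a zero norm in the cosine similarity, and overflow in exp inside the softmax.

```python
import numpy as np

def cosine_sim(q, d, eps=1e-8):
    """Cosine similarity between a query q (dim,) and each row of d (n, dim)."""
    q_norm = np.linalg.norm(q)
    d_norm = np.linalg.norm(d, axis=1)
    # eps (an assumed value) keeps the division finite if a vector is all zeros
    return d @ q / np.maximum(q_norm * d_norm, eps)

def stable_softmax(scores, gamma=10.0):
    """softmax(gamma * scores); gamma is the smoothing factor (value assumed)."""
    z = gamma * scores
    z = z - np.max(z)  # subtracting the max avoids overflow in exp
    e = np.exp(z)
    return e / e.sum()

def nll_loss(probs, pos_index=0, eps=1e-12):
    """Negative log-likelihood of the clicked document at pos_index."""
    # eps (an assumed value) avoids log(0) when the probability underflows
    return -np.log(probs[pos_index] + eps)

# Toy example: one query, one positive document, one negative, one all-zero row
q = np.array([1.0, 0.0])
d = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
sims = cosine_sim(q, d)
probs = stable_softmax(sims)
loss = nll_loss(probs)
```

If the loss still turns into NaN with guards like these, it may be worth checking the learning rate and looking for all-zero input vectors.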