Universal-Transformer-Pytorch

Halting probability exceeds threshold at step 2 from the second epoch onwards

Open zyzpower opened this issue 6 years ago • 0 comments

Hi,

When I run the model, it can reach the max step of 24 during the first epoch. From the second or third epoch onwards, however, the probability computed by "p = self.sigma(self.p(state)).squeeze(-1)" gets very close to the threshold, and the threshold is exceeded at step 2. As a result, my encoder effectively has only 2 layers. Any idea why?
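To make the symptom concrete, here is a minimal sketch in plain Python of a cumulative halting rule for a single position. It assumes the per-step sigmoid outputs are summed and the loop stops once the sum exceeds the threshold. The function name `steps_until_halt`, the threshold of 0.9, and the logit values are hypothetical and not taken from the repository.

```python
import math


def sigmoid(x):
    """Logistic function, the scalar equivalent of a sigmoid layer."""
    return 1.0 / (1.0 + math.exp(-x))


def steps_until_halt(logits, threshold=0.9, max_steps=24):
    """Count the steps taken by one position under a cumulative halting rule.

    logits: per-step pre-sigmoid halting scores (hypothetical values).
    threshold: assumed halting threshold, not read from the repository.
    """
    cumulative = 0.0
    steps = 0
    for logit in logits[:max_steps]:
        steps += 1
        cumulative += sigmoid(logit)  # per-step halting probability p
        if cumulative > threshold:    # halt once the running sum crosses it
            break
    return steps


# Strongly negative logits: p is about 0.018 per step, so all 24 steps run.
print(steps_until_halt([-4.0] * 24))
# Logits near zero: p is 0.5 per step, so the sum crosses 0.9 at step 2.
print(steps_until_halt([0.0] * 24))
```

Under this assumed rule, any per-step p above half the threshold already halts at step 2, so p does not need to approach the threshold itself for the depth to collapse to 2 layers.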

zyzpower, Oct 11 '19 08:10