arctic-captions
Question about the gradients of alphas in hard attention
Hello, I am really confused about the gradients of alphas in hard attention. The source code is at line 1199:
known_grads={alphas:opt_outs['masked_cost'][:,:,None]/10.* (alphas_sample/alphas) + alpha_entropy_c*(tensor.log(alphas) + 1)})
Can anyone explain this to me, please?
@denglixi I am also confused by this. Did you find the answer?
@denglixi @ysjakking Did you figure out the answer? I am also confused. Can anyone help me?
Me too...
@denglixi @ysjakking @shaoxuan92 @SijieSong Did you find a solution? I have also come across this problem.
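One possible reading of the quoted line, offered as a sketch and not as the authors' confirmed explanation: `known_grads` in Theano lets you supply the gradient with respect to a variable by hand, and the expression has two parts. `alphas_sample/alphas` is the derivative of `sum(alphas_sample * log(alphas))`, the log-probability of the sampled one-hot location, so multiplying it by the cost gives a REINFORCE-style estimate. `tensor.log(alphas) + 1` is the derivative of `sum(alphas * log(alphas))`, the negative entropy, so that term looks like an entropy regularizer weighted by `alpha_entropy_c`. The `/10.` is treated here as a plain scaling constant; its intended meaning is an assumption. The pure-Python check below uses hypothetical names (`known_grad_alpha`, `surrogate`) and made-up numbers for a single time step, and only verifies the calculus numerically.

```python
import math

def known_grad_alpha(alphas, sample, cost, entropy_c, scale=10.0):
    """Quoted known_grads expression, evaluated per location for one time step."""
    return [cost / scale * (s / a) + entropy_c * (math.log(a) + 1.0)
            for a, s in zip(alphas, sample)]

def surrogate(alphas, sample, cost, entropy_c, scale=10.0):
    """Hypothetical scalar whose derivative w.r.t. alphas is the expression above."""
    # Log-probability of the sampled (one-hot) attention location.
    log_prob = sum(s * math.log(a) for a, s in zip(alphas, sample))
    # Negative entropy of the attention distribution.
    neg_entropy = sum(a * math.log(a) for a in alphas)
    return cost / scale * log_prob + entropy_c * neg_entropy

def numeric_grad(f, alphas, eps=1e-6):
    """Central finite differences, treating each alpha as a free variable."""
    grads = []
    for i in range(len(alphas)):
        up = list(alphas)
        down = list(alphas)
        up[i] += eps
        down[i] -= eps
        grads.append((f(up) - f(down)) / (2 * eps))
    return grads

# Made-up values for illustration only.
alphas = [0.2, 0.5, 0.3]   # attention probabilities
sample = [0.0, 1.0, 0.0]   # one-hot sampled location
cost, entropy_c = 2.0, 0.002

analytic = known_grad_alpha(alphas, sample, cost, entropy_c)
numeric = numeric_grad(lambda a: surrogate(a, sample, cost, entropy_c), alphas)
max_diff = max(abs(x - y) for x, y in zip(analytic, numeric))
print(max_diff)  # should be close to zero
```

If this reading is right, only the sampled location receives the cost-weighted term, and every location receives the entropy term. Whether a baseline is subtracted from the cost is not visible in the quoted line.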