pointer-generator
How do I project the attention distribution onto the vocab distribution when both are tensors rather than lists?
Hi,
In my case, the attention distribution, the word indices on the encoder side, and the vocab distribution are all tensors. Their shapes are:

attn_dist -> [batch_size, num_decoder_steps, num_encoder_steps]: the attention distribution over all encoder time steps, for each decoder time step.
indices -> [batch_size, num_encoder_steps]: the word indices at each encoder time step.
vocab_dist -> [batch_size, num_decoder_steps, vocab_size].
I need to project attn_dist onto vocab_dist using indices to get the final_dist, following the same logic you have used here. The major difference is that I can't unpack the tensors into a list, since num_encoder_steps and num_decoder_steps are unknown in my case (they are inferred from the batch).
I am looking for the correct use of scatter_nd for this case. Could you please help me?
Also, if it doesn't work for the setup I described, I would still be happy with a solution for the scenario where all dimensions are known!
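To make the intended operation concrete, here is a NumPy sketch of the projection I am after (small fixed sizes purely for illustration; all names and values are hypothetical). It computes the copy distribution as a one-hot matmul, which produces the same result as a scatter-add over the vocabulary:

```python
import numpy as np

# Hypothetical small sizes just for illustration; in my real setting
# num_decoder_steps and num_encoder_steps are dynamic (batch-dependent).
batch_size, num_decoder_steps, num_encoder_steps, vocab_size = 2, 3, 4, 10

rng = np.random.default_rng(0)
attn_dist = rng.random((batch_size, num_decoder_steps, num_encoder_steps))
attn_dist /= attn_dist.sum(axis=-1, keepdims=True)   # normalize over encoder steps
vocab_dist = rng.random((batch_size, num_decoder_steps, vocab_size))
vocab_dist /= vocab_dist.sum(axis=-1, keepdims=True)
indices = rng.integers(0, vocab_size, size=(batch_size, num_encoder_steps))

# Desired projection: copy_dist[b, t, v] = sum over encoder steps e of
# attn_dist[b, t, e] where indices[b, e] == v. Written as a one-hot
# matmul, which is equivalent to a scatter-add into the vocabulary.
one_hot = np.eye(vocab_size)[indices]                      # [batch, enc, vocab]
copy_dist = np.einsum('bde,bev->bdv', attn_dist, one_hot)  # [batch, dec, vocab]

# final_dist as in the pointer-generator, with a hypothetical p_gen:
p_gen = rng.random((batch_size, num_decoder_steps, 1))
final_dist = p_gen * vocab_dist + (1.0 - p_gen) * copy_dist

# Both distributions sum to 1 over the vocabulary at every decoder step.
assert np.allclose(copy_dist.sum(axis=-1), 1.0)
assert np.allclose(final_dist.sum(axis=-1), 1.0)
```

If I understand correctly, in TensorFlow the one-hot form would be tf.one_hot(indices, vocab_size) followed by tf.matmul(attn_dist, one_hot) (or the equivalent tf.einsum), which should work even when num_encoder_steps and num_decoder_steps are only known at run time; a tf.scatter_nd version would instead need batched index pairs built with tf.range and tf.stack.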