pointer-generator
How do I project the attention distribution onto the vocab distribution when both are tensors rather than lists?
Hi,
In my case, the attention distribution, the word indices on the encoder side, and the vocab distribution are all tensors. Their shapes are:

attn_dist -> [batch_size, num_decoder_steps, num_encoder_steps]: the attention distribution over all encoder time steps, for each decoder time step.
indices -> [batch_size, num_encoder_steps]: the word indices at each encoder time step.
vocab_dist -> [batch_size, num_decoder_steps, vocab_size].
I need to project attn_dist onto vocab_dist using indices to get the final_dist, following the same logic you have used here. The major difference is that I can't unpack the tensors into a list, since num_encoder_steps and num_decoder_steps are unknown in my case (they are inferred from the batch).
I am looking for the correct use of scatter_nd for this case. Could you please help me?
Also, if it doesn't work for the setup I described, I would still be happy with a solution for the scenario where all dimensions are known!
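To make the intended operation concrete, here is a NumPy sketch of the projection I am after (small fixed sizes purely for illustration; all names and values are hypothetical). It computes the copy distribution as a one-hot matmul, which produces the same result as a scatter-add over the vocabulary:

```python
import numpy as np

# Hypothetical small sizes just for illustration; in my real setting
# num_decoder_steps and num_encoder_steps are dynamic (batch-dependent).
batch_size, num_decoder_steps, num_encoder_steps, vocab_size = 2, 3, 4, 10

rng = np.random.default_rng(0)
attn_dist = rng.random((batch_size, num_decoder_steps, num_encoder_steps))
attn_dist /= attn_dist.sum(axis=-1, keepdims=True)   # normalize over encoder steps
vocab_dist = rng.random((batch_size, num_decoder_steps, vocab_size))
vocab_dist /= vocab_dist.sum(axis=-1, keepdims=True)
indices = rng.integers(0, vocab_size, size=(batch_size, num_encoder_steps))

# Desired projection: copy_dist[b, t, v] = sum over encoder steps e of
# attn_dist[b, t, e] where indices[b, e] == v. Written as a one-hot
# matmul, which is equivalent to a scatter-add into the vocabulary.
one_hot = np.eye(vocab_size)[indices]                      # [batch, enc, vocab]
copy_dist = np.einsum('bde,bev->bdv', attn_dist, one_hot)  # [batch, dec, vocab]

# final_dist as in the pointer-generator, with a hypothetical p_gen:
p_gen = rng.random((batch_size, num_decoder_steps, 1))
final_dist = p_gen * vocab_dist + (1.0 - p_gen) * copy_dist

# Both distributions sum to 1 over the vocabulary at every decoder step.
assert np.allclose(copy_dist.sum(axis=-1), 1.0)
assert np.allclose(final_dist.sum(axis=-1), 1.0)
```

If I understand correctly, in TensorFlow the one-hot form would be tf.one_hot(indices, vocab_size) followed by tf.matmul(attn_dist, one_hot) (or the equivalent tf.einsum), which should work even when num_encoder_steps and num_decoder_steps are only known at run time; a tf.scatter_nd version would instead need batched index pairs built with tf.range and tf.stack.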