pytorch-seq2seq
TopKDecoder
Hi,
I wonder whether rnn.forward_step changes the ordering along the (batch_size * self.k) dimension?
Looking at the code that initializes sequence_scores:
and at how it is updated at each step:
it seems that sequence_scores is laid out like this (assuming self.k = 3):
Shouldn't hidden and inflated_encoder_outputs therefore be calculated as follows?
inflated_encoder_outputs = _inflate(encoder_outputs, self.k, 1).view(batch_size * self.k, -1)
hidden_shape = encoder_hidden.size()
hidden = _inflate(encoder_hidden, self.k, 2).view(hidden_shape[0], batch_size * self.k, hidden_shape[2])
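To make the concern concrete, here is a minimal sketch with toy values (not the repo's actual code), assuming _inflate expands tensors with Tensor.repeat and that sequence_scores expects the k beams of each batch element to sit next to each other: repeat tiles the whole batch, while repeat_interleave keeps the beams of one batch element contiguous.

import torch

# Toy values: batch_size = 2, k = 3, one hidden value per batch element.
batch_size, k = 2, 3
encoder_hidden = torch.tensor([[10.0], [20.0]])  # (batch_size, 1)

# Tensor.repeat tiles the whole batch: beams of different batch elements interleave.
tiled = encoder_hidden.repeat(k, 1)
print(tiled.squeeze(1))  # tensor([10., 20., 10., 20., 10., 20.])

# repeat_interleave keeps each batch element's k beams contiguous,
# which is the layout sequence_scores appears to assume.
grouped = encoder_hidden.repeat_interleave(k, dim=0)
print(grouped.squeeze(1))  # tensor([10., 10., 10., 20., 20., 20.])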
Hi, I am studying the code and have similar doubts. However, can you clarify what you mean by decoder_output? Do you actually mean log_softmax_output?
@JojoFisherman Yeah, I mean the output probabilities of the decoder, i.e. log_softmax_output.
I have the same question, and it surprises me that no one has answered this. If there were really something wrong in the beam search, surely it would output weird sequences. Did you reach any conclusion about this?
Several issues have reported that the beam search doesn't work correctly. Unfortunately, this repo doesn't seem to be actively maintained any more. Currently, I use fairseq (PyTorch) for related experiments.
I studied the code recently, and I think you can use torch.repeat_interleave, for example:
hidden = tuple([torch.repeat_interleave(h, self.k, dim=1) for h in encoder_hidden])
inflated_encoder_outputs = torch.repeat_interleave(encoder_outputs, self.k, dim=0)
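A quick sanity check with toy shapes (not the repo's real tensors) that this keeps each batch element's k beams contiguous along the inflated batch dimension:

import torch

k = 3
encoder_outputs = torch.arange(2.0).view(2, 1, 1)  # (batch, seq_len, hidden) = (2, 1, 1)
h = torch.arange(2.0).view(1, 2, 1)                # (num_layers, batch, hidden)
encoder_hidden = (h, h.clone())                    # LSTM-style (h, c) pair

hidden = tuple(torch.repeat_interleave(t, k, dim=1) for t in encoder_hidden)
inflated_encoder_outputs = torch.repeat_interleave(encoder_outputs, k, dim=0)

print(hidden[0].squeeze())                 # tensor([0., 0., 0., 1., 1., 1.])
print(inflated_encoder_outputs.squeeze())  # tensor([0., 0., 0., 1., 1., 1.])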
I had this problem with batch_size > 1, but after applying this suggestion it works now.
Thank you!!