
TopKDecoder

Hongzl1996 opened this issue 6 years ago · 6 comments

Hi, I wonder whether rnn.forward_step changes the ordering along the (batch_size * self.k) dimension.
Looking at the code that initializes sequence_scores and the code that updates it in each step (screenshots in the original post):

It seems like sequence_scores is updated assuming that the k beams of each batch element sit in adjacent rows, i.e. with self.k = 3, rows 0-2 belong to example 0, rows 3-5 to example 1, and so on (illustration in the original post).
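
Since the screenshots may not render, here is a generic sketch of the kind of update I mean; it is not the repo's exact code, and the sizes and names are assumptions:

```python
import torch

batch_size, k, V = 2, 3, 5  # hypothetical sizes; V is the vocabulary size

# One running log-probability per beam; only beam 0 of each example starts alive.
# This assumes the k beams of example i occupy rows i*k .. i*k + k - 1.
sequence_scores = torch.full((batch_size * k, 1), float('-inf'))
sequence_scores[::k] = 0.0

# One step's log-softmax output over the vocabulary, one row per beam.
log_softmax_output = torch.randn(batch_size * k, V).log_softmax(dim=1)

# Extend every beam with every token, then keep the best k per example.
step_scores = sequence_scores + log_softmax_output                 # (batch*k, V)
scores, candidates = step_scores.view(batch_size, -1).topk(k, dim=1)
```

If the hidden state rows are not in this same per-example order, the scores and the decoder states no longer line up.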

Should hidden and inflated_encoder_outputs be calculated as follows instead?

```python
inflated_encoder_outputs = _inflate(encoder_outputs, self.k, 1).view(batch_size * self.k, -1)
hidden_shape = encoder_hidden.size()
hidden = _inflate(encoder_hidden, self.k, 2).view(hidden_shape[0], batch_size * self.k, hidden_shape[2])
```
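
For concreteness, a toy example of the ordering difference (this assumes _inflate tiles with Tensor.repeat, which is how I read it):

```python
import torch

k = 3
# Toy hidden state: (n_layers=1, batch=2, hidden=1), so each value is its batch id.
encoder_hidden = torch.arange(2.0).view(1, 2, 1)

# Tiling with Tensor.repeat along the batch dim repeats whole batches:
tiled = encoder_hidden.repeat(1, k, 1)
print(tiled.view(-1).tolist())  # [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]

# Inflating along the hidden dim and reshaping (the fix above) instead puts
# the k copies of each batch element next to each other:
fixed = encoder_hidden.repeat(1, 1, k).view(1, 2 * k, 1)
print(fixed.view(-1).tolist())  # [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
```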

Hongzl1996 · Nov 05 '18

Hi, I am studying the code and have similar doubts. However, could you clarify what you mean by decoder_output? Do you actually mean log_softmax_output?

KwanWaiChung · May 29 '19

@JojoFisherman Yeah, I mean the output probabilities of the decoder, i.e. log_softmax_output.

Hongzl1996 · May 30 '19

I have the same question. It surprises me that no one has answered this. If there's really something wrong in the beam search, surely it would output some weird sequences. Do you have any conclusion about this?

KwanWaiChung · May 30 '19

It seems several issues have reported that beam search doesn't work correctly. Unfortunately, this repo does not appear to be actively maintained anymore. Currently, I use fairseq (the PyTorch version) to conduct some related experiments.

Hongzl1996 · May 30 '19

I studied the code these days, and I think you can use torch.repeat_interleave, as follows:

```python
hidden = tuple([torch.repeat_interleave(h, self.k, dim=1) for h in encoder_hidden])
inflated_encoder_outputs = torch.repeat_interleave(encoder_outputs, self.k, dim=0)
```
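
To make the shapes concrete, here is a minimal sketch of that inflation; the sizes and the single-layer LSTM encoder are hypothetical:

```python
import torch

batch_size, k = 2, 3
seq_len, hidden_size, n_layers = 5, 4, 1  # hypothetical sizes

encoder_outputs = torch.randn(batch_size, seq_len, hidden_size)
encoder_hidden = (torch.randn(n_layers, batch_size, hidden_size),  # h_n
                  torch.randn(n_layers, batch_size, hidden_size))  # c_n (LSTM)

hidden = tuple(torch.repeat_interleave(h, k, dim=1) for h in encoder_hidden)
inflated_encoder_outputs = torch.repeat_interleave(encoder_outputs, k, dim=0)

print(hidden[0].shape)                 # torch.Size([1, 6, 4])  (n_layers, batch*k, hidden)
print(inflated_encoder_outputs.shape)  # torch.Size([6, 5, 4])  (batch*k, seq_len, hidden)

# Rows i*k .. i*k + k - 1 all belong to batch element i, which is the layout
# the sequence_scores bookkeeping assumes.
assert torch.equal(inflated_encoder_outputs[0], inflated_encoder_outputs[1])
```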

GZJAS · Jun 11 '19

> I studied the code these days, and I think you can use torch.repeat_interleave, as follows: [...]

I had this problem with batch_size > 1, but after applying this comment it works now.

Thank you!!

muncok · Feb 21 '20