
In the decoder's forward pass, why is the RNN's input only the label embedding and the state, rather than the context vector c(t) obtained from attention? I don't understand this.

HaimianYu opened this issue 6 years ago · 5 comments

I don't quite understand the decoder forward pass in your code:

```python
def forward(self, inputs, init_state, contexts):
    if not self.config.global_emb:
        embs = self.embedding(inputs)
        outputs, state, attns = [], init_state, []
        for emb in embs.split(1):
            output, state = self.rnn(emb.squeeze(0), state)
            output, attn_weights = self.attention(output, contexts)
            output = self.dropout(output)
            outputs += [output]
            attns += [attn_weights]
        outputs = torch.stack(outputs)
        attns = torch.stack(attns)
        return outputs, state
    else:
        outputs, state, attns = [], init_state, []
        embs = self.embedding(inputs).split(1)
        max_time_step = len(embs)
        emb = embs[0]
        output, state = self.rnn(emb.squeeze(0), state)
        output, attn_weights = self.attention(output, contexts)
        output = self.dropout(output)
        soft_score = F.softmax(self.linear(output))
        outputs += [output]
        attns += [attn_weights]

        batch_size = soft_score.size(0)
        a, b = self.embedding.weight.size()

        for i in range(max_time_step - 1):
            emb1 = torch.bmm(soft_score.unsqueeze(1), self.embedding.weight.expand((batch_size, a, b)))
            emb2 = embs[i + 1]
            gamma = F.sigmoid(self.gated1(emb1.squeeze()) + self.gated2(emb2.squeeze()))
            emb = gamma * emb1.squeeze() + (1 - gamma) * emb2.squeeze()
            output, state = self.rnn(emb, state)
            output, attn_weights = self.attention(output, contexts)
            output = self.dropout(output)
            soft_score = F.softmax(self.linear(output))
            outputs += [output]
            attns += [attn_weights]
        outputs = torch.stack(outputs)
        attns = torch.stack(attns)
        return outputs, state
```

HaimianYu · Jan 04 '19

The same question!!

andiShan11 · Jun 12 '19

I'd like to ask the same thing; the code doesn't seem to match the paper.

Canadalynx · Jun 20 '19

same question

v587su · Jul 05 '19

The paper and the code are inconsistent. The implementation follows the method from Minh-Thang Luong's EMNLP 2015 paper, "Effective Approaches to Attention-based Neural Machine Translation".
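
For anyone comparing the two formulations, here is a minimal, illustrative PyTorch sketch of the difference; the layer names, sizes, and the general-style attention score below are my own assumptions for illustration, not the repository's actual modules:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative sizes only; none of these names come from the SGM repo.
batch, emb_size, hidden_size, src_len = 2, 8, 8, 5

# --- Luong-style step (roughly what the repo's forward() does) ---
# The GRU cell consumes only the label embedding; attention is applied to the
# GRU output afterwards to build the attentional vector used for scoring.
rnn = nn.GRUCell(emb_size, hidden_size)
score_proj = nn.Linear(hidden_size, hidden_size, bias=False)  # hypothetical "general" score projection
out_proj = nn.Linear(2 * hidden_size, hidden_size)            # combines [c_t; h_t]

def luong_step(emb, state, contexts):
    state = rnn(emb, state)                                        # h_t = GRU(g(y_{t-1}), h_{t-1})
    scores = torch.bmm(contexts, score_proj(state).unsqueeze(2))   # (batch, src_len, 1)
    weights = F.softmax(scores.squeeze(2), dim=1)                  # attention over encoder states
    c_t = torch.bmm(weights.unsqueeze(1), contexts).squeeze(1)     # context vector c_t
    output = torch.tanh(out_proj(torch.cat([c_t, state], dim=1)))  # attentional vector ~h_t
    return output, state

# --- Paper-style step, as I read the SGM paper: s_t = GRU([g(y_{t-1}); c_{t-1}], s_{t-1}) ---
# Here the previous context vector is concatenated with the label embedding
# *before* the GRU, which is exactly what the original question asks about.
rnn_paper = nn.GRUCell(emb_size + hidden_size, hidden_size)

def paper_step(emb, state, c_prev):
    return rnn_paper(torch.cat([emb, c_prev], dim=1), state)

# Tiny smoke test with random tensors.
emb = torch.randn(batch, emb_size)
state = torch.zeros(batch, hidden_size)
contexts = torch.randn(batch, src_len, hidden_size)  # encoder outputs
out, state = luong_step(emb, state, contexts)
```

In the Luong-style step the context vector only influences the classifier input, whereas the paper's formulation feeds c(t-1) back into the GRU; that is the discrepancy raised above.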

wjczf123 · May 03 '20

> The paper and the code are inconsistent. The implementation follows the method from Minh-Thang Luong's EMNLP 2015 paper, "Effective Approaches to Attention-based Neural Machine Translation".

I see that the attention mechanism in the code uses Luong attention, but is the decoding part also related to that paper?

MaRuoxue · Mar 30 '21