OpenNRE-PyTorch

Question about implementing a bidirectional LSTM

Open Anery opened this issue 6 years ago • 0 comments

Hello! I wrote a BiLSTM encoder: [image] I am not sure whether the image displays correctly, so here is the code:

import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence


class BiRNN(nn.Module):

    def __init__(self, config):
        super(BiRNN, self).__init__()
        self.config = config
        # per-sentence lengths; must be assigned before forward() is called
        self.sen_len = None
        self.in_width = self.config.word_size + 2 * self.config.pos_size
        self.rnn = nn.GRU(self.in_width, self.config.hidden_size, bidirectional=True, batch_first=True)
        self._init_lstm()

    def _init_lstm(self):
        # all_weights[d][0] is weight_ih and all_weights[d][1] is weight_hh
        # for direction d (0 = forward, 1 = backward)
        torch.nn.init.xavier_normal_(self.rnn.all_weights[0][0])
        torch.nn.init.xavier_normal_(self.rnn.all_weights[0][1])
        torch.nn.init.xavier_normal_(self.rnn.all_weights[1][0])
        torch.nn.init.xavier_normal_(self.rnn.all_weights[1][1])

    def forward(self, embedding):
        # sort by sentence length (descending) and remember how to undo the sort
        sen_len, perm_idx = self.sen_len.sort(0, descending=True)
        _, un_idx = torch.sort(perm_idx, dim=0)
        sen_tensor = embedding[perm_idx]

        packed_input = pack_padded_sequence(sen_tensor, sen_len.cpu().numpy(), batch_first=True)
        pack_out, _ = self.rnn(packed_input)
        # output: (batch, max_length, hidden_size * 2)
        output, _ = pad_packed_sequence(pack_out, batch_first=True, total_length=self.config.max_length)
        # restore the original batch order
        out = torch.index_select(output, 0, un_idx)
        # return the last time step of the padded output
        return out[:, -1, :]

But the training results are strange: from the second epoch on, the loss stops decreasing, and the accuracies stay fixed at NA acc=1 and not NA acc=0. [image]

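I printed the LSTM outputs and found that they are all very close to 0. Why does this happen? Is there something wrong with my LSTM code (I really cannot find the mistake)? Or how should I adjust the parameters? This result is far worse than the birnn model in OpenNRE, and I do not know why.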

This problem has been bothering me for several days. I would really appreciate any suggestions. Thanks!
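
One possible cause worth checking, offered as a sketch rather than a confirmed diagnosis: `pad_packed_sequence` fills every position past a sentence's true length with zeros, so taking the last time step of an output padded to `max_length` yields an all-zero vector for every sentence shorter than `max_length`. The toy sizes, the pre-sorted `sen_len`, and the name `sentence_repr` below are illustrative assumptions and do not come from the original config.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

torch.manual_seed(0)

# Hypothetical toy sizes, not the values from the original config.
batch, max_length, in_width, hidden_size = 3, 5, 4, 6
rnn = nn.GRU(in_width, hidden_size, bidirectional=True, batch_first=True)

embedding = torch.randn(batch, max_length, in_width)
sen_len = torch.tensor([5, 3, 2])  # assumed already sorted in descending order

packed = pack_padded_sequence(embedding, sen_len, batch_first=True)
pack_out, h_n = rnn(packed)
output, _ = pad_packed_sequence(pack_out, batch_first=True, total_length=max_length)

# Positions past each sentence's true length are zero padding, so the last
# time step is all zeros for every sentence shorter than max_length.
last_step = output[:, -1, :]
print(last_step.abs().sum(dim=1))

# One alternative: concatenate the final forward and backward hidden states.
# h_n has shape (num_layers * num_directions, batch, hidden_size).
sentence_repr = torch.cat([h_n[0], h_n[1]], dim=1)  # (batch, hidden_size * 2)
print(sentence_repr.shape)
```

In the encoder above, the batch is re-sorted before packing, so a hidden state taken this way would still need to be restored to the original order with `un_idx`, just as the padded output is. If most sentences are shorter than `max_length`, near-zero features for almost every example would also be consistent with the classifier collapsing to the NA class.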

Anery • Aug 23 '19 02:08