Danqing Wang
Please check your hyper-parameters. It seems that you have changed several **hidden size** settings and therefore cannot load the pretrained model. If you want to load the pretrained model directly, you...
Did you use the released checkpoint and set -m to 9 for the multi-news dataset? Or could you test your ROUGE installation by using the released multi-news outputs to calculate the...
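If setting up the official ROUGE-1.5.5/pyrouge pipeline is itself the suspect, a rough self-contained cross-check can help narrow things down. The helper below is a hypothetical, minimal ROUGE-N F1 approximation (not the official perl script the paper's numbers come from, so absolute values will differ slightly); scoring the released outputs against the references with it can at least reveal gross mismatches like a 6-point gap:

```python
# Hypothetical sanity-check helper: a crude ROUGE-N F1 over whitespace tokens.
# This is NOT the official ROUGE-1.5.5 evaluation; use it only to spot large
# discrepancies between your local setup and the released outputs.
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n_f1(hyp, ref, n=1):
    """Approximate ROUGE-N F1 between a hypothesis and a reference string."""
    hyp_ngrams = ngrams(hyp.lower().split(), n)
    ref_ngrams = ngrams(ref.lower().split(), n)
    overlap = sum((hyp_ngrams & ref_ngrams).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp_ngrams.values())
    recall = overlap / sum(ref_ngrams.values())
    return 2 * precision * recall / (precision + recall)
```

For example, averaging `rouge_n_f1` line by line over a released decode file and the matching reference file should land within a couple of points of the reported ROUGE-1; a much larger gap points at the data rather than the ROUGE installation.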
> Yes, I get a ROUGE score on the published output and a 6% difference on the multipurpose news dataset from the data listed by the author

What does "_multipurpose...
Could you recheck your data preprocessing? Since you have no problem with CNN/DM, there may be something wrong with your multi-news data. Before we released the checkpoints and outputs, we...
> > Could you recheck your data preprocessing? Since you have no problem with CNN/DM, there may be something wrong with your multi-news data.
> >
> > Before we released the...
> In your implementation, when the node is a word, the incoming vector is [0,0,0,0,0,0,0,0]

I'm not sure what you mean here. Which feature of the node does "incoming vector" refer to?

That said, while checking the related code I did find an omission in the implementation. WSGATLayer has an edge_attention function that computes attention weights from edges.src['z'] and edges.dst['z'], but before apply_edges, 'z' is assigned for only one node type (word).

https://github.com/dqwang122/HeterSumGraph/blob/d338dbedd6ccbb7e6a072c8c5171479b79a9b36d/module/GATLayer.py#L111

Since each edge connects a word node and a sent node, either src or dst necessarily carries a default-initialized 'z' (which may be the default zero vector you mentioned), so the information from both endpoints is not fully used here. This issue affects WSGATLayer and SWGATLayer; SGATLayer, whose two endpoints have the same node type, is unaffected.

Because fixing this changes the checkpoints, I have put the fix on the dev branch and will note it in the README. Thanks for pointing this out!
> The word-node and sent-node vectors have different dimensions, so it seems they cannot go through the same projection function in the GAT computation

I guess you are referring to this:

https://github.com/dqwang122/HeterSumGraph/blob/4bf23141c79794383dfa01d558244a4c868763b2/HiGraph.py#L96

In HSG and HDSG, the n_feature_proj and dn_feature_proj functions project the node features to the same dimension.
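To illustrate the idea (with hypothetical dimension values, not the repo's actual hyper-parameters): a single linear layer is enough to bring features of one node type into the shared hidden size before the GAT layers, which is all n_feature_proj / dn_feature_proj do.

```python
# Illustrative sketch only: project sentence-node features (sent_feat_dim)
# into the shared hidden size, as n_feature_proj does in HiGraph.py.
# The dimension values below are made up for the example.
import torch
import torch.nn as nn

sent_feat_dim, hidden_size = 128, 64
n_feature_proj = nn.Linear(sent_feat_dim, hidden_size, bias=False)

sent_feats = torch.randn(10, sent_feat_dim)   # 10 sentence nodes
sent_states = n_feature_proj(sent_feats)      # shape: (10, hidden_size)
```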
In the GAT implementation here, word and sent/doc nodes do not need to have the same dimension. You can see from the initialization of self.word2sent and self.sent2word that the corresponding in_dim and out_dim differ:

```python
# word -> sent
embed_size = hps.word_emb_dim
self.word2sent = WSWGAT(in_dim=embed_size,
                        out_dim=hps.hidden_size,
                        num_heads=hps.n_head,
                        attn_drop_out=hps.atten_dropout_prob,
                        ffn_inner_hidden_size=hps.ffn_inner_hidden_size,
                        ffn_drop_out=hps.ffn_dropout_prob,
                        feat_embed_size=hps.feat_embed_size,
                        layerType="W2S"
                        )
# sent -> word
self.sent2word = WSWGAT(in_dim=hps.hidden_size,
                        out_dim=embed_size,
                        num_heads=6,...
```
Yes, if you are referring to the update on the dev branch, then the word and sent dimensions do indeed need to be unified. Sorry, I thought you meant the master-branch implementation.

On dev, the hidden size needs to be unified with embed_size so that both word and sent features can be projected by the same fc function in WSGATLayer and SWGATLayer.

A better implementation, though, might be to add another fc function for the out_dim-to-out_dim mapping, e.g. in WSGATLayer:

```python
# init
self.wsfc = nn.Linear(in_dim, out_dim, bias=False)
self.ssfc = nn.Linear(out_dim, out_dim, bias=False)

# forward
g.nodes[wnode_id].data['z'] = self.wsfc(srch)
g.nodes[snode_id].data['z'] = self.ssfc(dsth)
```

Because of this implementation difference, some adjustment of the hyper-parameter settings may be needed.
You can refer to https://github.com/dqwang122/HeterSumGraph/issues/7#issuecomment-667178514 for a simple implementation. PreSumm also provides a script for it; you can find it at https://github.com/nlpyang/PreSumm/blob/master/src/prepro/data_builder.py#L161.
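In the same spirit as the PreSumm script linked above, a greedy oracle-label selection can be sketched as follows. This is an illustrative re-implementation, not the exact PreSumm code (function names and the simple unigram-F1 scorer here are my own): it repeatedly adds the document sentence that most improves overlap with the abstract, and stops when no sentence helps.

```python
# Minimal sketch of greedy extractive-oracle selection (illustrative only;
# PreSumm's greedy_selection uses proper ROUGE-1/2 scoring instead of the
# simple unigram F1 below).
from collections import Counter

def _f1(candidate_tokens, ref_counts):
    """Unigram-overlap F1 between candidate tokens and reference counts."""
    cand_counts = Counter(candidate_tokens)
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    p = overlap / sum(cand_counts.values())
    r = overlap / sum(ref_counts.values())
    return 2 * p * r / (p + r)

def greedy_label_selection(doc_sents, abstract, max_sents=3):
    """Greedily pick up to max_sents sentence indices as oracle labels."""
    ref_counts = Counter(abstract.lower().split())
    selected, selected_tokens, best = [], [], 0.0
    while len(selected) < max_sents:
        best_i, best_score = None, best
        for i, sent in enumerate(doc_sents):
            if i in selected:
                continue
            score = _f1(selected_tokens + sent.lower().split(), ref_counts)
            if score > best_score:
                best_i, best_score = i, score
        if best_i is None:  # no remaining sentence improves the score
            break
        selected.append(best_i)
        selected_tokens += doc_sents[best_i].lower().split()
        best = best_score
    return sorted(selected)
```

The early stop matters: without it, the selection keeps adding sentences that dilute precision, which is why greedy oracles typically end up with only a few labeled sentences per document.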