Danqing Wang
Please check your hyper-parameters. It seems that you have changed several **hidden size** settings and therefore cannot load the pretrained model. If you want to load the pretrained model directly, you...
Did you use the released checkpoint and set -m to 9 for the multi-news dataset? Or could you test your ROUGE installation by using the released multi-news outputs to calculate the...
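If setting up the official ROUGE-1.5.5/pyrouge pipeline is itself the suspect, a rough self-contained cross-check can help narrow things down. The helper below is a hypothetical, minimal ROUGE-N F1 approximation (not the official perl script the paper's numbers come from, so absolute values will differ slightly); scoring the released outputs against the references with it can at least reveal gross mismatches like a 6-point gap:

```python
# Hypothetical sanity-check helper: a crude ROUGE-N F1 over whitespace tokens.
# This is NOT the official ROUGE-1.5.5 evaluation; use it only to spot large
# discrepancies between your local setup and the released outputs.
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n_f1(hyp, ref, n=1):
    """Approximate ROUGE-N F1 between a hypothesis and a reference string."""
    hyp_ngrams = ngrams(hyp.lower().split(), n)
    ref_ngrams = ngrams(ref.lower().split(), n)
    overlap = sum((hyp_ngrams & ref_ngrams).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp_ngrams.values())
    recall = overlap / sum(ref_ngrams.values())
    return 2 * precision * recall / (precision + recall)
```

For example, averaging `rouge_n_f1` line by line over a released decode file and the matching reference file should land within a couple of points of the reported ROUGE-1; a much larger gap points at the data rather than the ROUGE installation.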
> Yes, I get a ROUGE score on the published output and a 6% difference on the multipurpose news dataset from the data listed by the author

What does "_multipurpose...
Could you recheck your data preprocessing? Since you have no problem with CNN/DM, there may be something wrong with your multi-news data. Before we released the checkpoints and outputs, we...
> > Could you recheck your data preprocessing? Since you have no problem with CNN/DM, there may be something wrong with your multi-news data.
> >
> > Before we released the...
> In your implementation, when the node is a word, the incoming vector is [0,0,0,0,0,0,0,0]

I'm not sure what you mean here. Which feature of the node does "incoming vector" refer to?

That said, while checking the related code I did find an omission in the implementation. WSGATLayer has an edge_attention function that computes attention weights from edges.src['z'] and edges.dst['z'], but before apply_edges, 'z' is assigned for only one node type (word).

https://github.com/dqwang122/HeterSumGraph/blob/d338dbedd6ccbb7e6a072c8c5171479b79a9b36d/module/GATLayer.py#L111

Since each edge connects a word node and a sent node, either src or dst necessarily carries a default-initialized 'z' (which may be the default zero vector you mentioned), so the information from both endpoints is not fully used here. This issue affects WSGATLayer and SWGATLayer; SGATLayer, whose two endpoints have the same node type, is unaffected.

Because fixing this changes the checkpoints, I have put the fix on the dev branch and will note it in the README. Thanks for pointing this out!
> The word-node and sent-node vectors have different dimensions, so it seems they cannot go through the same projection function in the GAT computation

I guess you are referring to this:

https://github.com/dqwang122/HeterSumGraph/blob/4bf23141c79794383dfa01d558244a4c868763b2/HiGraph.py#L96

In HSG and HDSG, the n_feature_proj and dn_feature_proj functions project the node features to the same dimension.
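To illustrate the idea (with hypothetical dimension values, not the repo's actual hyper-parameters): a single linear layer is enough to bring features of one node type into the shared hidden size before the GAT layers, which is all n_feature_proj / dn_feature_proj do.

```python
# Illustrative sketch only: project sentence-node features (sent_feat_dim)
# into the shared hidden size, as n_feature_proj does in HiGraph.py.
# The dimension values below are made up for the example.
import torch
import torch.nn as nn

sent_feat_dim, hidden_size = 128, 64
n_feature_proj = nn.Linear(sent_feat_dim, hidden_size, bias=False)

sent_feats = torch.randn(10, sent_feat_dim)   # 10 sentence nodes
sent_states = n_feature_proj(sent_feats)      # shape: (10, hidden_size)
```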
In the GAT implementation here, word and sent/doc nodes do not need to have the same dimension. You can see from the initialization of self.word2sent and self.sent2word that the corresponding in_dim and out_dim differ:

```python
# word -> sent
embed_size = hps.word_emb_dim
self.word2sent = WSWGAT(in_dim=embed_size,
                        out_dim=hps.hidden_size,
                        num_heads=hps.n_head,
                        attn_drop_out=hps.atten_dropout_prob,
                        ffn_inner_hidden_size=hps.ffn_inner_hidden_size,
                        ffn_drop_out=hps.ffn_dropout_prob,
                        feat_embed_size=hps.feat_embed_size,
                        layerType="W2S"
                        )
# sent -> word
self.sent2word = WSWGAT(in_dim=hps.hidden_size,
                        out_dim=embed_size,
                        num_heads=6,...
```
Yes, if you are referring to the update on the dev branch, then the word and sent dimensions do indeed need to be unified. Sorry, I thought you meant the master-branch implementation.

On dev, the hidden size needs to be unified with embed_size so that both word and sent features can be projected by the same fc function in WSGATLayer and SWGATLayer.

A better implementation, though, might be to add another fc function for the out_dim-to-out_dim mapping, e.g. in WSGATLayer:

```python
# init
self.wsfc = nn.Linear(in_dim, out_dim, bias=False)
self.ssfc = nn.Linear(out_dim, out_dim, bias=False)

# forward
g.nodes[wnode_id].data['z'] = self.wsfc(srch)
g.nodes[snode_id].data['z'] = self.ssfc(dsth)
```

Because of this implementation difference, some adjustment of the hyper-parameter settings may be needed.
You can refer to https://github.com/dqwang122/HeterSumGraph/issues/7#issuecomment-667178514 for a simple implementation. PreSumm also provides a script for it; you can find it at https://github.com/nlpyang/PreSumm/blob/master/src/prepro/data_builder.py#L161.
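In the same spirit as the PreSumm script linked above, a greedy oracle-label selection can be sketched as follows. This is an illustrative re-implementation, not the exact PreSumm code (function names and the simple unigram-F1 scorer here are my own): it repeatedly adds the document sentence that most improves overlap with the abstract, and stops when no sentence helps.

```python
# Minimal sketch of greedy extractive-oracle selection (illustrative only;
# PreSumm's greedy_selection uses proper ROUGE-1/2 scoring instead of the
# simple unigram F1 below).
from collections import Counter

def _f1(candidate_tokens, ref_counts):
    """Unigram-overlap F1 between candidate tokens and reference counts."""
    cand_counts = Counter(candidate_tokens)
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    p = overlap / sum(cand_counts.values())
    r = overlap / sum(ref_counts.values())
    return 2 * p * r / (p + r)

def greedy_label_selection(doc_sents, abstract, max_sents=3):
    """Greedily pick up to max_sents sentence indices as oracle labels."""
    ref_counts = Counter(abstract.lower().split())
    selected, selected_tokens, best = [], [], 0.0
    while len(selected) < max_sents:
        best_i, best_score = None, best
        for i, sent in enumerate(doc_sents):
            if i in selected:
                continue
            score = _f1(selected_tokens + sent.lower().split(), ref_counts)
            if score > best_score:
                best_i, best_score = i, score
        if best_i is None:  # no remaining sentence improves the score
            break
        selected.append(best_i)
        selected_tokens += doc_sents[best_i].lower().split()
        best = best_score
    return sorted(selected)
```

The early stop matters: without it, the selection keeps adding sentences that dilute precision, which is why greedy oracles typically end up with only a few labeled sentences per document.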