Shun Zheng comments

Results: 16 comments of Shun Zheng

Something wrong in Chinese?

I'm considering using Chinese characters to mimic English words, and it seems to work fine (in Python 3.6): ` string = '北京欢迎您 ! 北京欢...

How can we reproduce the results in the tables of your paper?

Thanks for the reply. However, I still cannot reproduce your results in Tables 2, 5, or 6. For example, Table 5 denotes the setting of using a higher-capacity encoder, which...

How can we reproduce the results in the tables of your paper?

Got it, thanks. I thought the percentage denoted the proportion of loss reduction, which should be (0.02023 - 0.01179) / 0.02023, while you used the percentage of loss increment when...
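
To make the two readings of the percentage concrete, here is a small sketch using the two loss values quoted above. The reduction formula is the one stated in the comment. The increment formula is only an assumption about what "loss increment" means (the same gap measured against the lower loss), since the original comment is cut off at that point.

```python
# Loss values quoted in the comment above.
higher_loss = 0.02023
lower_loss = 0.01179

gap = higher_loss - lower_loss

# Proportion of loss reduction, as written in the comment:
# the gap relative to the higher loss.
reduction = gap / higher_loss

# ASSUMPTION: "loss increment" is the same gap relative to the lower loss.
increment = gap / lower_loss

print(f"loss reduction: {reduction:.1%}")
print(f"loss increment: {increment:.1%}")
```

The same gap gives a noticeably larger percentage when it is divided by the smaller loss, which would explain a mismatch between the two calculations.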

question about your evaluation

You should follow the evaluation logic in this repo. Thanks!

Dataset source

Sorry, we cannot share that.

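Using BERT when a single sentence is longer than 512 tokens

For a single sentence, the current setting drops everything beyond token position 128. Since Doc2EDAG is a document-level model that natively handles multi-sentence input, an overlong sentence can be cut into shorter ones before being fed in. In this repo, setting `--rearrange_sent == True` does this automatically (it only works for well-formed Chinese text).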
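
The idea of cutting an overlong sentence into shorter ones can be sketched as follows. This is not the repo's `--rearrange_sent` implementation; `split_long_sentence` is a hypothetical helper, the punctuation set is an assumption, and length is counted in characters on the assumption that one Chinese character maps to roughly one BERT token.

```python
import re

MAX_LEN = 128  # per-sentence limit mentioned in the comment above

# ASSUMPTION: these marks are acceptable break points in Chinese text.
CLAUSE_PATTERN = re.compile(r"[^，。；！？]*[，。；！？]|[^，。；！？]+")


def split_long_sentence(sentence, max_len=MAX_LEN):
    """Split a sentence into chunks of at most max_len characters."""
    # Each clause keeps its trailing punctuation mark, if it has one.
    clauses = CLAUSE_PATTERN.findall(sentence)
    chunks, current = [], ""
    for clause in clauses:
        # A clause that is itself too long gets hard cuts.
        while len(clause) > max_len:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(clause[:max_len])
            clause = clause[max_len:]
        # Start a new chunk when adding this clause would exceed the limit.
        if len(current) + len(clause) > max_len:
            chunks.append(current)
            current = ""
        current += clause
    if current:
        chunks.append(current)
    return chunks
```

Joining the returned chunks reproduces the input, so no text is dropped; each chunk can then be treated as a separate sentence of the document.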

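Using BERT when a single sentence is longer than 512 tokens

1. gold and pred indicate whether the candidate entities used for event arguments come from the data annotations (gold) or from the model's predictions (pred). 2. Whichever option is used for the event table filling part, the NER part always shows the results corresponding to pred.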

Using BERT when a single sentence is longer than 512 tokens

Please refer to the `eval.sh` file.

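Using BERT when a single sentence is longer than 512 tokens

@mrkdian Annotating trigger words requires an extra knowledge base (an event trigger vocabulary) as well as extra matching rules (when several words in one article match, which one should be labeled as the trigger?). These are the annotation pain points of trigger-dependent approaches. Our reproduction of DCFEE is based on the assumption in its paper that "the sentence containing most of the event arguments is the key sentence that triggers the event"; either way, the final goal is to obtain the event type and its corresponding event records.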
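
The quoted assumption can be illustrated with a toy sketch. This is neither DCFEE's code nor this repo's reproduction of it; `find_key_sentence` is a hypothetical helper that simply picks the sentence mentioning the most distinct argument strings, using plain substring matching.

```python
def find_key_sentence(sentences, arguments):
    """Return the index of the sentence that mentions the most arguments.

    Ties go to the earliest sentence. Raises ValueError if there are
    no sentences.
    """
    if not sentences:
        raise ValueError("need at least one sentence")
    # Count how many distinct argument strings appear in each sentence.
    counts = [sum(1 for arg in set(arguments) if arg in sent)
              for sent in sentences]
    return max(range(len(sentences)), key=counts.__getitem__)
```

Under this heuristic no trigger vocabulary is needed: the key sentence falls out of the argument annotations themselves.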

Better performance when reproducing the paper

This is reasonable, since the results reported in the paper were based on the PyTorch and CUDA backends of two years ago.