
Results on WebQuestionsSP can't be reproduced: 0.714 in the paper, but 0.608 when I run it

Open Xie-Minghui opened this issue 3 years ago • 9 comments

I ran the code using the same random seed (I didn't change your code), and the final results are as follows: 1-hop accuracy is 0.732, 2-hop is 0.445, and the total accuracy is 0.608. But the WebQuestionsSP result in your paper is 0.714, which is much higher than 0.608. I wonder how you trained your model, or whether this is a mistaken result in your paper.
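For context, running with "the same random seed" only reproduces results if every RNG involved is seeded and cuDNN is forced into deterministic mode. A minimal sketch of such a helper (not the repo's own code, just an illustration of the usual PyTorch recipe):

```python
import random

import numpy as np
import torch


def set_seed(seed: int) -> None:
    """Seed all RNGs that affect training and make cuDNN deterministic."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # no-op on CPU-only machines
    # Deterministic kernels; disables cuDNN autotuning (may be slower).
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
```

Even with all of this, results can still differ across library versions or hardware, which is consistent with the discrepancy reported in this thread.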

Xie-Minghui avatar Oct 29 '21 11:10 Xie-Minghui

[screenshots: training log and checkpoint metrics]

This is our checkpoint information. It seems that your loss is higher than ours. I don't know whether this is due to some inconsistency in the BERT initialization or the Huggingface version. Actually, we have observed unexpected performance drops with different BERT initializations.

shijx12 avatar Nov 11 '21 08:11 shijx12

I also ran the code for ComplexWebQuestions; the results are val: 0.49, test: 0.45. The result in the paper is 48.7. When you ran your code, what were the results on the validation and test sets? Thank you.

Xie-Minghui avatar Nov 11 '21 08:11 Xie-Minghui

> [screenshots: training log and checkpoint metrics]
>
> This is our checkpoint information. It seems that your loss is higher than ours. I don't know whether this is due to some inconsistency in the BERT initialization or the Huggingface version. Actually, we have observed unexpected performance drops with different BERT initializations.

How do you initialize BERT? I just use the same code as yours.

Xie-Minghui avatar Nov 11 '21 08:11 Xie-Minghui

> I also ran the code for ComplexWebQuestions; the results are val: 0.49, test: 0.45. The result in the paper is 48.7. When you ran your code, what were the results on the validation and test sets? Thank you.

We get 48.6 validation accuracy. For CompWebQ, we report the results on the validation set in the paper.

shijx12 avatar Nov 11 '21 08:11 shijx12

> How do you initialize BERT? I just use the same code as yours.

We downloaded the BERT weights from Huggingface. However, Huggingface upgrades their API and model weights, which may cause unexpected performance issues; we have observed this several times recently. We are uploading our checkpoint and will share the link with you soon.
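One common safeguard against this kind of dependency drift is to pin exact library versions in a `requirements.txt`. The version numbers below are illustrative guesses for a 2020-era setup, not the authors' confirmed environment:

```
# requirements.txt -- illustrative pins, not the authors' verified versions
transformers==3.4.0
torch==1.6.0
```

With pins like these, `pip install -r requirements.txt` reproduces the same library code on any machine, although the downloaded model weights themselves can still change unless a specific model revision is also fixed.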

shijx12 avatar Nov 11 '21 08:11 shijx12

Please refer to https://cloud.tsinghua.edu.cn/f/786b9853c1d840578025/?dl=1 for our log and checkpoint.

ShulinCao avatar Nov 18 '21 11:11 ShulinCao

> Please refer to https://cloud.tsinghua.edu.cn/f/786b9853c1d840578025/?dl=1 for our log and checkpoint.

Hello, could you upload the BERT model files you used (the .pt, vocab.txt, config.json, etc.), or even just the BERT version?

Xie-Minghui avatar Nov 18 '21 11:11 Xie-Minghui

Our experiments were mostly done between September and November 2020. That machine was later reinstalled, and we only backed up the code and experiment results, not things like ~/.cache. The BERT version should be some 2020 release of Transformers; we will try to see whether we can recover the version we used at the time.

shijx12 avatar Nov 30 '21 07:11 shijx12

> Please refer to https://cloud.tsinghua.edu.cn/f/786b9853c1d840578025/?dl=1 for our log and checkpoint.

Hello! Sorry to bother you, but is there a log for CWQ?

Huiopfsdfsdf avatar Sep 01 '23 13:09 Huiopfsdfsdf