livebot

Problem in reproducing the paper results

Open • PKULiuHui opened this issue 5 years ago • 1 comment

Hello, I have read your paper and trained the transformer model with the processed dataset. However, I can't reproduce the results in the paper.

The results in the paper are shown below: [pic1] We can see that the model trained with both video and comments performs best.

I downloaded the code and the processed dataset, trained the model on a single GPU (it takes about 2.5 days for 50 epochs), and tested the model at the final epoch, following the instructions in README.md. (I also tested other checkpoints, and the results were close.) To get the Comment_Only and Video_Only results, I trained two more models with n_img set to 0 and n_com set to 0, respectively. [pic2] The values are much higher than the results in your paper, and the Video_Only model shows extremely high performance, which is counter-intuitive.

So I checked the source code and found something strange (maybe a bug?). Line 155 in transformer.py is shown below. [pic4] It returns the rank of the 100 candidate comments by loss for each test item, and the evaluation metrics are calculated from that ranking. The list should be sorted by the candidates' log-likelihood scores in descending order. However, the cross-entropy loss is the negative of the log-likelihood: a lower loss means a higher log-likelihood score. So I think the candidates should be ranked by loss in ascending order. I fixed that line and ran the test again. [pic3] This time, the model trained with both video and comments performs best, as expected. However, the values are much lower than the results in your paper.
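To make the ranking-direction point concrete, here is a minimal sketch in plain Python. It is not the code from transformer.py: the function names, the toy losses, and the two metrics are assumptions chosen only for illustration, and they are not necessarily the metrics used in the paper.

```python
from typing import List, Sequence, Set


def rank_by_loss(losses: Sequence[float], lowest_first: bool = True) -> List[int]:
    """Return candidate indices ordered from best to worst.

    With lowest_first=True the candidate with the smallest loss
    (i.e. the highest log-likelihood) comes first.
    """
    return sorted(range(len(losses)), key=lambda i: losses[i],
                  reverse=not lowest_first)


def recall_at_k(ranking: Sequence[int], gold: Set[int], k: int) -> float:
    """1.0 if any ground-truth candidate is in the top k, else 0.0."""
    return float(any(idx in gold for idx in ranking[:k]))


def reciprocal_rank(ranking: Sequence[int], gold: Set[int]) -> float:
    """1 / position of the first ground-truth candidate (0.0 if none)."""
    for position, idx in enumerate(ranking, start=1):
        if idx in gold:
            return 1.0 / position
    return 0.0


# Hypothetical toy data: four candidates, candidate 2 is the ground truth
# and also has the lowest cross-entropy loss.
losses = [2.3, 1.7, 0.4, 3.1]
gold = {2}

ascending = rank_by_loss(losses, lowest_first=True)    # lowest loss first
descending = rank_by_loss(losses, lowest_first=False)  # highest loss first

print("ascending :", ascending, recall_at_k(ascending, gold, 1),
      reciprocal_rank(ascending, gold))
print("descending:", descending, recall_at_k(descending, gold, 1),
      reciprocal_rank(descending, gold))
```

In this toy case, sorting by loss in ascending order puts the ground-truth candidate first, while sorting in descending order puts it last, so every rank-based metric flips with the sort direction.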

So I have some questions and look forward to your reply:

  1. Is there really a bug in transformer.py line 155, or is it just my misunderstanding?
  2. Why are the results so different from the paper (too high and strange with the original code, too low with my fixed code)? Is the processed dataset the one you used in the paper? Or is there something else wrong in the code?
  3. What results do you get after running the code in the repository? I have trained twice and got results close to those shown above.

PKULiuHui, Aug 10 '19 10:08

I am getting similar scores, any update on this?

fireflyHunter, Feb 12 '20 12:02