
Several issues I encountered in replicating the paper

Open fireflyHunter opened this issue 4 years ago • 1 comments

Hi,

I am very interested in your work. In order to better understand your work, I have made several attempts to reproduce the baseline results reported in the paper.

When using the processed dataset provided in this repo, I get scores similar to those reported in this existing investigation issue. These results are summarised below.

image

However, when I process the raw dataset myself and train the same model on it, the performance drops significantly:

image

I tried different partitions of the data, but the results vary very little. To understand the reason for these differences, I examined the provided dataset and found a number of identical videos that were assigned different video IDs and that appear in both the training and test sets. I think this may have arisen from the same video being crawled multiple times during creation of the dataset.
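For anyone who wants to run a similar check, here is a minimal sketch of how cross-split duplicates could be detected. It is not the exact script I used. It assumes each video can be reduced to a single content string (for example its concatenated comment text); the function names, the `|`-joined format, and the toy IDs are all hypothetical.

```python
import hashlib
from collections import defaultdict


def fingerprint(content: str) -> str:
    """Return a SHA-256 hex digest of a video's content string.

    Assumption: `content` is some canonical text form of the video,
    e.g. its comments joined together. This is not the repo's format.
    """
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def find_cross_split_duplicates(train: dict, test: dict) -> list:
    """Return sorted (train_id, test_id) pairs with identical content."""
    # Index the training videos by their content fingerprint.
    train_by_hash = defaultdict(list)
    for video_id, content in train.items():
        train_by_hash[fingerprint(content)].append(video_id)

    # Look up every test video in that index.
    pairs = []
    for video_id, content in test.items():
        for train_id in train_by_hash.get(fingerprint(content), []):
            pairs.append((train_id, video_id))
    return sorted(pairs)


# Toy data: IDs and contents are made up for illustration.
train = {"v1": "nice song|great video", "v2": "first!"}
test = {"v9": "nice song|great video", "v10": "something else"}
duplicates = find_cross_split_duplicates(train, test)
print(duplicates)  # [('v1', 'v9')]
```

Exact hashing only catches byte-identical content; videos that were re-crawled with slightly different comment sets would need a fuzzier comparison.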

There is clearly a very large performance gap between my results and the baselines in the paper, and I wonder whether these duplicated videos are responsible for it. Could you please tell me:

  • Is the provided processed dataset exactly the same one that was used to produce the baselines in the paper?
  • Is the duplicated data the only cause of the performance difference, or are there other reasons you are aware of that could explain it?

fireflyHunter, Apr 08 '20 12:04

We carried out a series of investigations and eventually obtained scores close to the baselines:

image

We report the details of our findings in the paper: Response to LiveBot: Generating Live Video Comments Based on Visual and Textual Contexts

fireflyHunter, Jun 05 '20 11:06