
Two questions about the experimental results in Table 1 of the paper.

yangzhen1997 opened this issue 3 years ago · 5 comments

Hi, I would like to ask two questions about the experimental results in Table 1 of the paper. First, where does the 53.97 accuracy for full fine-tuning on SSv2 come from? [Screenshot 2022-10-10 17:56:13] When I read VideoMAE, I found that pre-training on SSv2 and then fine-tuning on SSv2 reaches 69.3. I know your paper uses the K400 pre-trained parameters, but I also ran experiments and can reach 65+ with 50 epochs of fine-tuning on SSv2: [screenshot]

  1. So my first question is: where does the 53.97 come from?
  2. The second question: I could not find the data in the chart below anywhere in the table; is it written wrong? [Screenshot 2022-10-10 18:10:30]

yangzhen1997 · Oct 10 '22

Hi,

Thanks for raising these questions.

  1. May I know your detailed configuration, including the command and the pre-trained weights?

  2. Good catch, and we are sorry for that typo. We updated the table but missed the main text. Thanks again for pointing it out; we've fixed it in our camera-ready version.

ShoufaChen · Oct 15 '22

Hi,

1. Pre-trained weights: the pre-trained weights I use are

https://drive.google.com/file/d/1JfrhN144Hdg7we213H1WxwR3lGYOlmIn/view

The download location is shown in the picture in Annex 1.

2. Shell script: I basically did not change the shell script relative to VideoMAE, only the batch size, from 256 to 64. The setup is therefore K400 pre-training for 800 epochs followed by 50 epochs of fine-tuning on SSv2. As the picture in Annex 2 shows, the final results are comparable to the same setup using SSv2 pre-training for 1600 epochs; both exceed 65 top-1 accuracy. The shell script is in Annex 3.

yangzhen1997 · Oct 15 '22

Thanks for your reply. Did you experiment with the VideoMAE codebase?

I guess you experimented with strong data augmentation and a strong optimizer (e.g., AdamW). For a fair comparison with linear probing, we use the same setting as the linear probe, which uses SGD and no strong data augmentation.
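
To make the contrast concrete, here is a minimal PyTorch sketch of the two regimes; the stand-in modules, the 174-class head, and all hyperparameter values are placeholders, not the paper's exact configuration:

```python
import torch
import torch.nn as nn

# Stand-in modules; the real experiments use a ViT backbone initialized
# from VideoMAE K400 weights. All values below are placeholders.
backbone = nn.Sequential(nn.Linear(768, 768), nn.ReLU())
head = nn.Linear(768, 174)  # 174 classes in SSv2
model = nn.Sequential(backbone, head)

# Linear-probe-style setting described above: freeze the backbone and
# train only the head with plain SGD and no strong augmentation.
for p in backbone.parameters():
    p.requires_grad = False
probe_opt = torch.optim.SGD(head.parameters(), lr=0.1, momentum=0.9)

# Full fine-tuning setting (VideoMAE-style recipe): AdamW over all
# parameters, typically paired with mixup/RandAugment-style augmentation.
for p in model.parameters():
    p.requires_grad = True
full_opt = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=0.05)
```

The point of the SGD setting is comparability with the linear-probe baseline, not peak accuracy.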

Please let me know if I missed something.

ShoufaChen · Oct 15 '22

Thanks for your reply. I will redo the experiment to verify this!

yangzhen1997 · Oct 15 '22

@ShoufaChen @yangzhen1997 I was also experiencing the same problem. Even though you removed those augmentations and the AdamW optimizer, would your method still improve the results? Based on my experiments, adding augmentations and the AdamW optimizer did not improve (and sometimes degraded) performance. This is because in full fine-tuning they are used to reduce overfitting when tuning many parameters; in VPT and your method, since only a small fraction of the parameters is tuned, they do not improve performance. Therefore, would it be fair to report the full fine-tuning results without any augmentations or sophisticated optimizers?
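
For concreteness, here is a minimal PyTorch sketch of the "small fraction of parameters" setup; the BottleneckAdapter below is a generic stand-in rather than AdaptFormer's exact module, and the dimensions are placeholders:

```python
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Generic down/up-projection adapter with a residual connection.
    AdaptFormer's actual module differs in detail (placement, scaling,
    initialization)."""
    def __init__(self, dim: int, hidden: int = 64):
        super().__init__()
        self.down = nn.Linear(dim, hidden)
        self.up = nn.Linear(hidden, dim)

    def forward(self, x):
        return x + self.up(torch.relu(self.down(x)))

# Stand-in for a pre-trained transformer block plus classification head.
model = nn.Sequential(
    nn.Linear(768, 768),      # frozen pre-trained weights
    BottleneckAdapter(768),   # the only new, tunable module
    nn.Linear(768, 174),      # classification head (left frozen here)
)

# Freeze everything, then re-enable gradients only for the adapter.
for p in model.parameters():
    p.requires_grad = False
for m in model.modules():
    if isinstance(m, BottleneckAdapter):
        for p in m.parameters():
            p.requires_grad = True

tunable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"tunable parameters: {tunable:,} / {total:,} ({100 * tunable / total:.1f}%)")
```

Running this prints a tunable fraction of roughly 12%, which illustrates why regularization designed for full fine-tuning has little to act on here.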

wgcban · Mar 02 '23