Kun Su
Hi, I have a similar question. Have you figured it out by any chance? In general, I am trying to extract a correspondence map but not sure how to do...
Sure, in translate.py line 193, you have the srnn_loss for one action. However, after the for loop over all actions, in line 472, you print out the srnn_loss, which is...
@Miss-DN Hi, the view-invariant transformation is very important in our case, and that's why we put it at the very beginning of the Methods section. However, I don't think...
Thanks for your reply. I am also wondering: if I try some kind of augmentation by changing the fps from 20 to another value such as 60, how should...
Thanks for your reply! I am also curious about training convergence and how to find the best model among the variants. I think the best eval loss could vary according to...
Thanks again for your reply. Regarding linear probing, have you tried using the CLS token output instead of average-pooling the rest of the encoder output features? I saw that in...
Yeah, that's what I thought. I can tune the learning rate, but is there any particular reason that the contrastive loss needs a smaller learning rate? In Table 3, Audio-Visual...
Hi Yuangong, I am wondering whether you could provide the script for downloading AudioSet via youtube-dl? I have tried it myself on the eval set so far, but found some videos were...
Hi @YuanGongND, thanks for your reply. In terms of audio-only, I have yet to find a link that could download all the audio files. It would be super helpful...
@YuanGongND Got it! Thanks!