Mouxing Young
> You may try the code in https://github.com/mangye16/Cross-Modal-Re-ID-baseline

Thanks for your reply. I have tried the recommended code with the triplet loss and the proposed WRT loss, respectively. However, the one...
Hi~ I have tried the recommended code with the triplet loss and the proposed WRT loss, respectively. However, the one with the triplet loss outperforms the one with the WRT loss by ~7% and...
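For reference, below is a minimal PyTorch sketch of the weighted regularized triplet (WRT) loss used in that baseline, assuming a standard batch of embeddings with identity labels; the function and variable names here are illustrative, not taken from the repository:

```python
import torch
import torch.nn.functional as F

def weighted_regularized_triplet(embeddings, labels):
    """Sketch of the weighted regularized triplet (WRT) loss: instead of
    hard mining with a fixed margin, positives and negatives are
    softmax-weighted by distance and a soft margin (softplus) is applied."""
    dist = torch.cdist(embeddings, embeddings, p=2)        # (B, B) pairwise distances
    same = labels.unsqueeze(0).eq(labels.unsqueeze(1)).float()
    eye = torch.eye(len(labels), device=labels.device)
    is_pos = same - eye                                    # same identity, excluding self-pairs
    is_neg = 1.0 - same                                    # different identity

    # Harder positives (far away) and harder negatives (close by) get larger weights;
    # non-candidate pairs are masked out with a large negative logit.
    w_pos = F.softmax(dist * is_pos - 1e4 * (1 - is_pos), dim=1)
    w_neg = F.softmax(-dist * is_neg - 1e4 * (1 - is_neg), dim=1)

    pos_term = (w_pos * dist).sum(dim=1)                   # weighted positive distance per anchor
    neg_term = (w_neg * dist).sum(dim=1)                   # weighted negative distance per anchor
    return F.softplus(pos_term - neg_term).mean()          # soft-margin triplet objective
```

With a batch of embeddings and identity labels, this can be swapped against a plain margin-based triplet loss in the same training loop for a direct comparison.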
> Hi~ It's a nice piece of work. The proposed CC loss seems to narrow the modality gap, which is often done with a triplet loss. So, how about replacing...

In my implementation, MPANet with the CC loss outperforms MPANet with the triplet loss by nearly 10% in terms of performance. I really wonder why the CC loss could help the...
Hi, sorry for the late reply, and thanks for your interest. For SAR and EATA, we used the code from [this repository](https://github.com/mr-eggplant/SAR) and modified the input and output parameters without...
Hi, we follow CAV-MAE (Gong et al.) and first extract 10 frames from each video. Then, we add corruptions to the images following the ImageNet-C benchmark. As for the audio, we...
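As a rough illustration of that preprocessing, here is a minimal sketch that samples 10 evenly spaced frames from a video with OpenCV and corrupts them with the `imagecorruptions` package (ImageNet-C style corruptions); the function name and arguments are illustrative, not the authors' actual script:

```python
import cv2
import numpy as np
from imagecorruptions import corrupt  # pip install imagecorruptions

def extract_and_corrupt(video_path, num_frames=10,
                        corruption="gaussian_noise", severity=5):
    """Sample `num_frames` evenly spaced frames from a video and apply an
    ImageNet-C style corruption to each frame (sketch, not the released code)."""
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    idxs = np.linspace(0, max(total - 1, 0), num_frames).astype(int)

    frames = []
    for i in idxs:
        cap.set(cv2.CAP_PROP_POS_FRAMES, int(i))
        ok, frame = cap.read()
        if not ok:
            continue
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        # imagecorruptions expects an RGB uint8 array of shape (H, W, 3)
        frames.append(corrupt(rgb, corruption_name=corruption, severity=severity))
    cap.release()
    return frames
```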
Hi, we just followed the fine-tuning pipeline of [CAV-MAE](https://github.com/YuanGongND/cav-mae/blob/master/egs/vggsound/run_cavmae_ft.sh) on the VGGSound dataset to get cav_mae_ks50.pth. The main modification is that I replaced the label weight file (NOTE: not model...
Hi, sorry for the delayed response. I’ve downloaded the repo, conducted the experiment on VGGSound with Gaussian-5 using the released command, and successfully reproduced the results (~40.3). I noticed that...