Yonglong Tian
Hi @Ys-Jung77, I don't know the difference between your settings and mine, but the numbers reported in the README are with a batch size of 1024. E.g., SimCLR gets 70.7 while...
@meyerjo, To answer your question, we used a "patch-based contrastive objective" for this task; please refer to section 3.5.2. That being said, for each modality, we extract a _global feature_...
@jnyjxn, Yes, the NYU dataset has fewer than 2k images. This is different from the ImageNet experiment, but it has been described in the supplementary of the paper.
Hi, Thanks for your comment! Which specific result are you referring to? Or are you suggesting that an EMA of Z could potentially improve all of InsDis, MoCo, and CMC with...
@KaimingHe, yeah, probably the current NCE implementation is less suitable for MoCo, and I am happy to rectify it. What is the best momentum multiplier for updating Z you...
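For context, the momentum update of Z being discussed could be sketched as below. This is a minimal illustration, not the repo's actual code; the function name `ema_update_z` and the default momentum value are assumptions.

```python
def ema_update_z(z, z_batch, m=0.9):
    """Hypothetical EMA update for the NCE normalization constant Z.

    z:       running estimate of Z (None before the first batch)
    z_batch: estimate of Z computed from the current batch
    m:       momentum multiplier (the hyperparameter under discussion)
    """
    if z is None:
        # Initialize the running estimate from the first batch.
        return z_batch
    # Exponential moving average: keep m of the old value, blend in the rest.
    return m * z + (1.0 - m) * z_batch
```

For example, `ema_update_z(100.0, 110.0)` moves the running estimate only a tenth of the way toward the batch estimate, which is what makes the constant stable across batches.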
Thanks for your input! I have temporarily removed the NCE numbers from the README to avoid any confusion, and will keep them vacant until I get a chance to look into...
Hi @PhilLint, You are right. For YCbCr I used the mean and std. For Lab, I just normalize the input to [-1, 1].
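The two normalization schemes mentioned above could look like the following sketch. This is illustrative only: the helper names and the placeholder mean/std values are assumptions, not the values used in the repo.

```python
import numpy as np

def normalize_ycbcr(img, mean, std):
    # Standard per-channel standardization, (x - mean) / std,
    # as described for the YCbCr inputs.
    # img: array of shape (3, H, W); mean, std: arrays of shape (3,).
    return (img - mean[:, None, None]) / std[:, None, None]

def normalize_lab(img):
    # For Lab, simply rescale channels already in [0, 1] linearly into [-1, 1].
    return img * 2.0 - 1.0
```

Note that the Lab path has no dataset statistics at all; it is a fixed affine rescaling, which is why no mean/std need to be computed for it.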
Hi @macaodha, Good question! I think most of the previous literature used the so-called CaffeNet from the original Caffe team, whose channel widths are 96, 256, 384, 384, 256. Here my model is also...
Hi @chihyaoma, `AvgPool2d` was used in a previous version of PyTorch's official implementation. They have since switched to `AdaptiveAvgPool2d`. I guess it's essentially the same in my case, as the feature...
Hi @IgorSusmelj, Thanks for pointing this out. That's what I meant: since the standard input is 224x224, the size before the final pooling layer is always 7x7. Therefore...
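The equivalence claimed in the last two replies can be checked numerically: on a 7x7 feature map, `nn.AvgPool2d(7)` and `nn.AdaptiveAvgPool2d(1)` both reduce to the global spatial mean. Below is a pure-NumPy sketch of that argument (the helper functions mimic the two PyTorch layers; they are not the library's code).

```python
import numpy as np

def avg_pool2d(x, kernel):
    # Mimics nn.AvgPool2d(kernel) with stride == kernel:
    # average over non-overlapping kernel x kernel windows.
    c, h, w = x.shape
    oh, ow = h // kernel, w // kernel
    out = np.zeros((c, oh, ow))
    for i in range(oh):
        for j in range(ow):
            window = x[:, i * kernel:(i + 1) * kernel, j * kernel:(j + 1) * kernel]
            out[:, i, j] = window.mean(axis=(1, 2))
    return out

def adaptive_avg_pool2d_1x1(x):
    # Mimics nn.AdaptiveAvgPool2d(1): global mean over spatial dims.
    return x.mean(axis=(1, 2), keepdims=True)

# A 7x7 feature map, as produced from a 224x224 input.
feat = np.random.rand(256, 7, 7)
fixed = avg_pool2d(feat, 7)
adaptive = adaptive_avg_pool2d_1x1(feat)
```

When the kernel size equals the spatial size, the single pooling window covers the whole map, so the two outputs coincide; they would differ only for inputs whose pre-pooling resolution is not 7x7, which is exactly when `AdaptiveAvgPool2d` matters.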