Comments of Kaiming He
You reported a low number for MoCo with the NCE loss. This is because your implementation of NCE is problematic; correcting it should give a more reasonable MoCo w/...
0.99 for updating Z works well. On ImageNet-1K, MoCo with NCE is ~2% worse than MoCo with InfoNCE, similar to the memory bank counterpart.
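For readers who want to see what "updating Z" could look like, here is a minimal sketch, not the implementation discussed in this thread. It assumes that Z is the NCE normalization constant, that 0.99 is the momentum of an exponential moving average over per-batch Monte Carlo estimates of Z, and that the noise distribution is uniform (1/n_data). The function names `update_z` and `nce_loss` and all parameter names are hypothetical.

```python
import math


def update_z(z, logits, temperature, n_data, momentum=0.99):
    """Moving-average update of the normalization constant Z (hypothetical helper).

    Assumption: Z approximates sum_j exp(l_j / temperature) over all n_data
    instances, estimated from the sampled logits of the current batch.
    """
    # Monte Carlo estimate: n_data times the mean of exp(logit / temperature).
    z_batch = n_data * sum(math.exp(l / temperature) for l in logits) / len(logits)
    # Exponential moving average; momentum=0.99 keeps 99% of the old estimate.
    return momentum * z + (1.0 - momentum) * z_batch


def nce_loss(l_pos, l_neg, z, temperature, n_data):
    """Binary NCE loss for one query, assuming a uniform noise distribution."""
    # m * P_n, with m = number of noise samples and P_n = 1 / n_data.
    noise = len(l_neg) / n_data

    def h(logit):
        # Posterior probability that a sample came from the data distribution.
        p = math.exp(logit / temperature) / z
        return p / (p + noise)

    # The positive should be classified as data, the negatives as noise.
    return -math.log(h(l_pos)) - sum(math.log(1.0 - h(l)) for l in l_neg)
```

In this sketch Z is treated as a constant inside the loss and refreshed separately with `update_z`, so a stale or poorly initialized Z changes the scale of every probability in the loss.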