pixel-level-contrastive-learning
positive pairs
The two crops cover different regions and scales of the original image, yet after the network the two outputs have the same size. So how do you find the positive pairs, i.e., where does the correspondence come from? After the two different transformations, the two crops are not aligned.
i have a coordinate matrix for each image, which i also crop and interpolate, so that i can then finally determine the positive pairs through some distance threshold
You mean the same crop and the other transformations are applied to both the image and its corresponding coordinates? So the transformed coordinates of the two crops refer to the same coordinate system in the original image?
Yes that's correct
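For anyone else landing here, a minimal sketch of this coordinate-tracking idea (illustrative only, not the repository's actual code; the crop parameters, output size, and distance threshold are made-up example values):

```python
import torch
import torch.nn.functional as F

def make_coord_grid(h, w):
    # (2, h, w) grid holding each pixel's (y, x) position in the original image
    ys, xs = torch.arange(h).float(), torch.arange(w).float()
    grid_y, grid_x = torch.meshgrid(ys, xs, indexing='ij')
    return torch.stack([grid_y, grid_x])

def crop_and_resize(coords, top, left, ch, cw, out_size):
    # apply the SAME crop + resize used on the image to the coordinate grid
    crop = coords[:, top:top + ch, left:left + cw]
    return F.interpolate(crop[None], size=out_size, mode='bilinear', align_corners=False)[0]

# two random crops of a 256x256 image, both ending up as 8x8 feature maps
coords = make_coord_grid(256, 256)
view1 = crop_and_resize(coords, top=10, left=20, ch=180, cw=180, out_size=(8, 8))
view2 = crop_and_resize(coords, top=60, left=50, ch=160, cw=160, out_size=(8, 8))

# pairwise distances between the original-image coordinates of all output positions
flat1 = view1.reshape(2, -1).t()   # (64, 2) original (y, x) per position in view 1
flat2 = view2.reshape(2, -1).t()   # (64, 2) original (y, x) per position in view 2
dist = torch.cdist(flat1, flat2)   # (64, 64) distances in original-image pixels

positive_mask = dist < 30.0        # positions closer than the threshold are positives
```

Because both grids carry original-image coordinates, the mask identifies which positions in the two (otherwise misaligned) crops actually overlap.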
@lucidrains thank you for the great work!
Could you please add some comments or documentation on how you're calculating these pairs?
I have tried debugging and going step by step in your implementation, but there are still things I don't really understand.
For instance:
- In the paper they say the distances are normalized to the diagonal length of a feature map, whereas it seems you are normalizing by the diagonal length of the actual image? (See the sketch below this list.)
- I also don't really understand this step from the paper: the feature map is first warped to the original image space.
- How do you track the features corresponding to a pixel in the original image space?
Thanks again!
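As a rough illustration of the normalization question in the first bullet: dividing the same raw pixel distances by the feature-map diagonal versus the image diagonal gives very different normalized values, so a fixed positive threshold selects different pairs. Purely illustrative numbers; neither line is claimed to be the repository's exact code:

```python
import math
import torch

# raw pairwise distances in original-image pixel units,
# e.g. the `dist` matrix from the sketch in the earlier comment
dist = torch.rand(64, 64) * 100.0
fm_h, fm_w = 8, 8          # feature map size
img_h, img_w = 256, 256    # original image size

# reading 1: normalize by the feature-map diagonal (as quoted from the paper)
dist_fm = dist / math.sqrt(fm_h ** 2 + fm_w ** 2)

# reading 2: normalize by the image diagonal (what the implementation appears to do)
dist_img = dist / math.sqrt(img_h ** 2 + img_w ** 2)

threshold = 0.7  # example value; any fixed threshold selects different
                 # positive pairs under the two normalizations
print((dist_fm < threshold).sum(), (dist_img < threshold).sum())
```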
I have the same questions. Did you ever figure them out?