ncnet
Question about the correspondence calculation
Hi @ignacio-rocco, thanks for the great work! Some parts of the code that recover the pixel positions of matches from the correspondences are not fully clear to me.
From what I can see in lib/point_tnf.py:
    if delta4d is not None: # relocalization
        delta_iA,delta_jA,delta_iB,delta_jB = delta4d
        diA=delta_iA.squeeze(0).squeeze(0)[iA.view(-1),jA.view(-1),iB.view(-1),jB.view(-1)]
        djA=delta_jA.squeeze(0).squeeze(0)[iA.view(-1),jA.view(-1),iB.view(-1),jB.view(-1)]
        diB=delta_iB.squeeze(0).squeeze(0)[iA.view(-1),jA.view(-1),iB.view(-1),jB.view(-1)]
        djB=delta_jB.squeeze(0).squeeze(0)[iA.view(-1),jA.view(-1),iB.view(-1),jB.view(-1)]
        iA=iA*k_size+diA.expand_as(iA)
        jA=jA*k_size+djA.expand_as(jA)
        iB=iB*k_size+diB.expand_as(iB)
        jB=jB*k_size+djB.expand_as(jB)
You first rescale the low-resolution indices to the resolution before the 4D max-pooling, then add the shift recorded by the max-pooling operation. So the method corr_to_matches outputs the correspondence locations at the 200x100 resolution, with each location expressed as a fraction, pixel/total_pixel_num.
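To make the relocalization step concrete, here is a minimal 1D sketch of the same idea in plain Python. It is not the repository's code: the helper names `maxpool1d_with_offsets` and `relocalize` and the score values are hypothetical, and it assumes non-overlapping pooling windows of size `k_size`.

```python
def maxpool1d_with_offsets(values, k_size):
    """Max-pool a list in non-overlapping windows of size k_size.

    Returns the pooled maxima and, for each window, the offset of the
    maximum inside that window (the 1D analogue of the delta tensors).
    """
    pooled, deltas = [], []
    for start in range(0, len(values), k_size):
        window = values[start:start + k_size]
        best = max(range(len(window)), key=window.__getitem__)
        pooled.append(window[best])
        deltas.append(best)
    return pooled, deltas


def relocalize(coarse_index, deltas, k_size):
    """Map a coarse (pooled) index back to the fine index of its maximum."""
    return coarse_index * k_size + deltas[coarse_index]


# Hypothetical fine-resolution match scores.
fine_scores = [0.1, 0.9, 0.3, 0.2, 0.4, 0.8]
k_size = 2

pooled, deltas = maxpool1d_with_offsets(fine_scores, k_size)
coarse_best = max(range(len(pooled)), key=pooled.__getitem__)
fine_best = relocalize(coarse_best, deltas, k_size)
```

The coarse index only identifies the pooling window; multiplying by `k_size` gives the window's first fine index, and adding the stored offset picks the element that actually produced the maximum.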
Then, in eval_inloc.py, lines 179-190:
    if k_size>1:
        yA_=yA_*(fs1*k_size-1)/(fs1*k_size)+0.5/(fs1*k_size)
        xA_=xA_*(fs2*k_size-1)/(fs2*k_size)+0.5/(fs2*k_size)
        yB_=yB_*(fs3*k_size-1)/(fs3*k_size)+0.5/(fs3*k_size)
        xB_=xB_*(fs4*k_size-1)/(fs4*k_size)+0.5/(fs4*k_size)
    else:
        yA_=yA_*(fs1-1)/fs1+0.5/fs1
        xA_=xA_*(fs2-1)/fs2+0.5/fs2
        yB_=yB_*(fs3-1)/fs3+0.5/fs3
        xB_=xB_*(fs4-1)/fs4+0.5/fs4
I don’t understand why this recentering is still needed, and it is also not clear to me how it works. I would appreciate it if you could help me understand what you are doing here. Thanks!
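For anyone following along, the arithmetic of the formula can be checked numerically. The sketch below rests on one assumption that the snippets above do not confirm: that the normalized coordinate of cell `i` out of `N` is `i/(N-1)`, so the first and last cells sit exactly at 0 and 1. Under that assumption, `y*(N-1)/N + 0.5/N` equals `(i+0.5)/N`, which is the centre of cell `i`. The function name `recenter` and the value of `n_cells` are hypothetical.

```python
def recenter(coord, n_cells):
    """Apply the recentering formula from the snippet above."""
    return coord * (n_cells - 1) / n_cells + 0.5 / n_cells


n_cells = 4  # stands in for fs, or fs*k_size when k_size > 1

# Assumed convention: cell i is stored as i / (n_cells - 1), spanning [0, 1].
corner_aligned = [i / (n_cells - 1) for i in range(n_cells)]

# After recentering, each coordinate should sit at its cell centre.
centered = [recenter(c, n_cells) for c in corner_aligned]
cell_centres = [(i + 0.5) / n_cells for i in range(n_cells)]
```

If the assumption holds, the `k_size>1` branch is the same mapping with `N = fs*k_size`, the number of cells at the pre-pooling resolution.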