facer
The predicted semantic mask has a slight offset.
When I tested the newly published pretrained model, I found that the semantic segmentation results and the input images do not match perfectly. I looked through the `tanh_warp`-related processing and found that the coordinates passed to `grid_sample` might have a slight problem. In the current code, the `align_corners` option is disabled by default, so the sampling coordinates should span (0, n) rather than (0, n-1). I changed the code slightly and found that the alignment improved significantly.
In `facer.facer.transform.py`, lines 218 & 219:

```python
yy = yy.unsqueeze(0).broadcast_to(batch_size, h, w).to(device)
xx = xx.unsqueeze(0).broadcast_to(batch_size, h, w).to(device)
```

change to

```python
yy = yy.unsqueeze(0).broadcast_to(batch_size, h, w).to(device) + 0.5
xx = xx.unsqueeze(0).broadcast_to(batch_size, h, w).to(device) + 0.5
```
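To illustrate why the half-pixel shift matters, here is a minimal sketch (the helper `pixel_to_normalized` is hypothetical, not part of facer) of how `grid_sample` maps pixel indices to its normalized [-1, 1] coordinates under the two `align_corners` conventions. With `align_corners=False`, pixel centers sit at half-integer positions, so integer indices need the `+ 0.5` offset before normalization:

```python
def pixel_to_normalized(x, size, align_corners=False):
    """Map a pixel index to grid_sample's normalized [-1, 1] coordinate.

    Hypothetical helper for illustration; mirrors PyTorch's documented
    coordinate conventions, not any function in facer.
    """
    if align_corners:
        # Corner pixel centers map exactly to -1 and +1.
        return 2.0 * x / (size - 1) - 1.0
    # Pixel centers sit at x + 0.5 within the range [0, size].
    return 2.0 * (x + 0.5) / size - 1.0


# With align_corners=True, index 0 of a width-4 image maps to -1.0.
print(pixel_to_normalized(0, 4, align_corners=True))   # -1.0
# With align_corners=False, the center of pixel 0 maps to -0.75;
# omitting the +0.5 would yield -1.0, i.e. half a pixel too far left.
print(pixel_to_normalized(0, 4, align_corners=False))  # -0.75
print(pixel_to_normalized(3, 4, align_corners=False))  # 0.75
```

So if a grid is built from integer indices but sampled with `align_corners=False`, every sample lands half a pixel off, which matches the slight mask offset described above.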
Are the above changes reasonable?
Another question: does `face_parsing` have a resolution limit on the input image?
I also found that the suggested +0.5 appears to fix the registration.