FBA_Matting

Distance Transform

Open bluesky314 opened this issue 4 years ago • 6 comments

Hey, I do not understand how the distance map is being used here, or what exactly the clicks variable is supposed to represent:

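import cv2
import numpy as np


def dt(a):
    # Euclidean distance from each non-zero pixel of `a` to the nearest zero pixel.
    return cv2.distanceTransform((a * 255).astype(np.uint8), cv2.DIST_L2, 0)


def trimap_transform(trimap):
    # The trimap is indexed as trimap[:, :, k] for k = 0, 1 (one binary channel each).
    h, w = trimap.shape[0], trimap.shape[1]

    clicks = np.zeros((h, w, 6))
    L = 320  # reference length that sets the three Gaussian widths
    for k in range(2):
        # The three channels of an empty trimap channel are left at zero.
        if np.count_nonzero(trimap[:, :, k]) > 0:
            # Negative squared distance to the nearest pixel set in channel k.
            dt_mask = -dt(1 - trimap[:, :, k])**2
            # Gaussian falloffs with sigma = 0.02 * L, 0.08 * L and 0.16 * L.
            clicks[:, :, 3*k] = np.exp(dt_mask / (2 * ((0.02 * L)**2)))
            clicks[:, :, 3*k+1] = np.exp(dt_mask / (2 * ((0.08 * L)**2)))
            clicks[:, :, 3*k+2] = np.exp(dt_mask / (2 * ((0.16 * L)**2)))

    return clicks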

Can you please explain this?

bluesky314 • Jun 29 '20 13:06

Instead of feeding the network only the binary trimap, we also feed it a distance-transformed version. The distance from the definite foreground and background regions is a strong indicator of what the alpha could be.

The clicks variable represents the transformed trimap. The variable name is not the most accurate and will eventually be fixed.
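
As a rough illustration of that idea, here is a minimal sketch on a toy one-row mask. The helper brute_force_dt is hypothetical and not part of the repository; it is a slow stand-in for the cv2.distanceTransform call in dt, and the 1x8 mask is made up for the example. It assumes only NumPy.

```python
import numpy as np


def brute_force_dt(mask):
    """Distance from every pixel to the nearest pixel where mask == 1.

    Hypothetical helper: a slow stand-in for dt(1 - mask), usable only
    on tiny arrays.
    """
    ys, xs = np.nonzero(mask)
    seeds = np.stack([ys, xs], axis=1).astype(np.float64)  # (n, 2)
    h, w = mask.shape
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    grid = np.stack([rows, cols], axis=-1).astype(np.float64)  # (h, w, 2)
    diff = grid[:, :, None, :] - seeds[None, None, :, :]  # (h, w, n, 2)
    return np.sqrt((diff ** 2).sum(axis=-1)).min(axis=-1)


# Toy trimap channel: the definite region covers only the first two pixels.
region = np.zeros((1, 8))
region[0, :2] = 1

dist = brute_force_dt(region)  # 0 inside the region, growing outside it
sigma = 0.02 * 320  # narrowest of the three widths in trimap_transform
encoded = np.exp(-dist ** 2 / (2 * sigma ** 2))

print(dist[0])
print(encoded[0])
```

The encoded value is 1 inside the definite region and decays smoothly with distance, which is the extra signal the network receives on top of the binary mask.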

MarcoForte • Jul 03 '20 11:07

OK, but it does not seem like a simple distance transform. What do 3*k, 3*k+1, 3*k+2 and 2 * ((0.02 * L)**2), 2 * ((0.08 * L)**2), ... mean in the for loop? What is the whole loop doing? And why does clicks have 6 channels?

bluesky314 • Jul 03 '20 17:07

The distance transform is used to compute an approximate alpha matte based on the trimap.

The first function used here falls to approximately 0 at a distance of about 25 pixels, the second at about 100 pixels, and the third at about 200 pixels.

Here is a plot of the distance vs. the approximate alpha matte value for the first function:

https://www.wolframalpha.com/input/?i=plot+e%5E%28-%28%281-x%29%5E2%29+%2F+%282++%28%280.02++320%29%5E2%29%29%29

clicks has 6 channels because the three functions are evaluated on the distance to both the fixed foreground and the fixed background of the trimap, giving 2 x 3 = 6 channels.
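
To check those numbers, here is a minimal sketch that evaluates the three functions at 25, 100 and 200 pixels and lists the channel indices the loop writes to. The names falloff and channel_index are made up for this example; only L = 320 and the factors 0.02, 0.08 and 0.16 come from the code above.

```python
import math

L = 320                      # reference length hard-coded in trimap_transform
SCALES = (0.02, 0.08, 0.16)  # the three factors used in the loop
DISTANCES = (25, 100, 200)   # distances quoted above, in pixels


def falloff(distance, scale, length=L):
    """Value written into clicks for a pixel `distance` px from the region."""
    sigma = scale * length
    return math.exp(-distance ** 2 / (2 * sigma ** 2))


def channel_index(k, j):
    """Channel for trimap channel k (0 or 1) and width j (0, 1 or 2)."""
    return 3 * k + j


sigmas = [scale * L for scale in SCALES]
values = [falloff(d, s) for d, s in zip(DISTANCES, SCALES)]
layout = [channel_index(k, j) for k in range(2) for j in range(3)]

for sigma, d, v in zip(sigmas, DISTANCES, values):
    print(f"sigma={sigma:.1f}px, distance={d}px, value={v:.6f}")
print("channels:", layout)
```

Each quoted distance is roughly 3.9 times the corresponding sigma (6.4, 25.6 and 51.2 pixels), which is why all three values are close to 0 there.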

99991 • Jul 16 '20 06:07

Thanks @99991 and @MarcoForte, but I don't see how the distance transform makes sense, as the images are all at different scales. Distance in pixel space, which is what the distance transform uses, may not mean much when the image is a close-up of a person's face, because all points are close to the object of interest.
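
To make the concern concrete, here is a small sketch. The image widths are hypothetical; the only values taken from the code above are L = 320 and the factor 0.02. Because L is a constant, the encoded value depends only on the pixel distance, not on how large the image is.

```python
import math

L = 320           # constant in trimap_transform, independent of image size
sigma = 0.02 * L  # narrowest of the three widths


def falloff(distance_px):
    # Same Gaussian as the first channel written by trimap_transform.
    return math.exp(-distance_px ** 2 / (2 * sigma ** 2))


distance_px = 25
report = []
for width in (320, 3200):  # hypothetical image widths
    fraction = distance_px / width
    report.append((width, fraction, falloff(distance_px)))
    print(f"width={width}: {distance_px}px is {fraction:.2%} of the width, "
          f"value={falloff(distance_px):.6f}")
```

The value is identical for both widths, even though 25 pixels covers a very different fraction of each image.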

bluesky314 • Aug 08 '20 10:08

@99991, @MarcoForte Can either of you clarify whether I am getting something wrong about the appropriateness of distance maps?

bluesky314 • Aug 13 '20 03:08

@99991, @MarcoForte Did you get a chance to think about the point above, that the distance transform does not take the scale of the image into account?

bluesky314 • Nov 28 '20 10:11