CRAFT-pytorch

about niter variable

Open ntdat017 opened this issue 5 years ago • 2 comments

I looked through the source code and the getDetBoxes_core() function, but I can't understand the intuition behind the niter variable. I understand that niter acts like a padding in pixels, but why is it computed with this particular formula?

        segmap = np.zeros(textmap.shape, dtype=np.uint8)
        segmap[labels==k] = 255
        segmap[np.logical_and(link_score==1, text_score==0)] = 0   # remove link area
        x, y = stats[k, cv2.CC_STAT_LEFT], stats[k, cv2.CC_STAT_TOP]
        w, h = stats[k, cv2.CC_STAT_WIDTH], stats[k, cv2.CC_STAT_HEIGHT]
        niter = int(math.sqrt(size * min(w, h) / (w * h)) * 2)
        sx, ex, sy, ey = x - niter, x + w + niter + 1, y - niter, y + h + niter + 1
        # boundary check
        if sx < 0 : sx = 0
        if sy < 0 : sy = 0
        if ex >= img_w: ex = img_w
        if ey >= img_h: ey = img_h
        kernel = cv2.getStructuringElement(cv2.MORPH_RECT,(1 + niter, 1 + niter))
        segmap[sy:ey, sx:ex] = cv2.dilate(segmap[sy:ey, sx:ex], kernel)

ntdat017, Nov 01 '19 15:11

@ntdat017 A silly answer to a wise question. The formula is based on empirical observation. :)

        niter = int(math.sqrt(size * min(w, h) / (w * h)) * 2)

Actually, there are two parts.

(1) size/(w*h) is the occupancy ratio of the text region within its rectangular bounding box. We found that a single-character region needs to be dilated more, and this term makes that possible.

(2) min(w,h) makes the dilation ratio proportional to the box height. I think this part is more intuitive than the one above.

This part is an open question, and any suggestions regarding post-processing will be helpful. :)
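
To get a feel for the two terms described above, the expression can be evaluated on a few made-up components. This is only a sketch: the helper name `dilation_iterations` and the sample measurements are hypothetical and not part of the repository. Note that the expression simplifies to 2 * sqrt(occupancy * min(w, h)), so under this formula the value grows with the square root of the shorter box side.

```python
import math


def dilation_iterations(size, w, h):
    """Evaluate the niter expression from getDetBoxes_core().

    size: number of pixels in the connected component
    w, h: width and height of its bounding box
    (The function name is a hypothetical wrapper for illustration.)
    """
    return int(math.sqrt(size * min(w, h) / (w * h)) * 2)


# Hypothetical measurements: (size, w, h)
examples = {
    "square box, fully filled": (400, 20, 20),      # occupancy 1.0
    "square box, half filled": (200, 20, 20),       # occupancy 0.5
    "wide box, fully filled": (2000, 100, 20),      # same height as above
    "wide box, twice as tall": (8000, 200, 40),     # occupancy 1.0
}

for name, (size, w, h) in examples.items():
    occupancy = size / (w * h)
    print(f"{name}: occupancy={occupancy:.2f}, niter={dilation_iterations(size, w, h)}")
```

With these sample values, the two fully filled boxes of height 20 give the same niter regardless of width, the half-filled box gives a smaller value, and the taller box gives a larger one.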

YoungminBaek, Dec 04 '19 08:12

@YoungminBaek Many thanks for your answer; I get the idea behind this post-processing now! But silly me, I still can't understand what sx, sy, ex, and ey are and why they are necessary. Also, in kernel = cv2.getStructuringElement(cv2.MORPH_RECT,(1 + niter, 1 + niter)), niter decides the size of the kernel. Why is that? Thank you!

Boatsure, Sep 23 '20 01:09