CRAFT-pytorch
about niter variable
I looked through the source code and the getDetBoxes_core() function, but I can't understand the purpose of the niter variable. I understand that niter works like a padding in pixels, but why is it computed with this particular formula?
segmap = np.zeros(textmap.shape, dtype=np.uint8)
segmap[labels==k] = 255
segmap[np.logical_and(link_score==1, text_score==0)] = 0 # remove link area
x, y = stats[k, cv2.CC_STAT_LEFT], stats[k, cv2.CC_STAT_TOP]
w, h = stats[k, cv2.CC_STAT_WIDTH], stats[k, cv2.CC_STAT_HEIGHT]
niter = int(math.sqrt(size * min(w, h) / (w * h)) * 2)
sx, ex, sy, ey = x - niter, x + w + niter + 1, y - niter, y + h + niter + 1
# boundary check
if sx < 0 : sx = 0
if sy < 0 : sy = 0
if ex >= img_w: ex = img_w
if ey >= img_h: ey = img_h
kernel = cv2.getStructuringElement(cv2.MORPH_RECT,(1 + niter, 1 + niter))
segmap[sy:ey, sx:ex] = cv2.dilate(segmap[sy:ey, sx:ex], kernel)
@ntdat017 A silly answer to a wise question. The formula is based on empirical observation. :)
niter = int(math.sqrt(size * min(w, h) / (w * h)) * 2)
Actually, there are two parts.
(1) size/(w*h) is the occupancy ratio of the text region over the rectangular text box. We found that a single-character region needs to be dilated more, and this term makes that possible.
(2) min(w,h) makes the dilation proportional to the box height. I think this is more intuitive than the above formula.
This part is an open question, and any suggestions regarding post-processing will be helpful. :)
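To make the two parts concrete, here is a minimal sketch that evaluates the formula for a few made-up components. The helper name compute_niter and the sample numbers are illustrative assumptions, not taken from the repository; only the expression itself comes from the code quoted above. Written out, the formula is niter = int(2 * sqrt(occupancy * min(w, h))), where occupancy = size / (w * h).

```python
import math


def compute_niter(size, w, h):
    """Evaluate the niter expression from getDetBoxes_core().

    size: pixel area of the connected component
    w, h: width and height of its bounding box
    """
    return int(math.sqrt(size * min(w, h) / (w * h)) * 2)


# Hypothetical components: (description, size, w, h)
samples = [
    ("square box, half filled", 200, 20, 20),
    ("wide box, half filled, same height", 1000, 100, 20),
    ("square box, fully filled", 400, 20, 20),
    ("wide box, half filled, double height", 4000, 200, 40),
]

for name, size, w, h in samples:
    occupancy = size / (w * h)
    print(f"{name}: occupancy={occupancy:.2f}, min(w,h)={min(w, h)}, "
          f"niter={compute_niter(size, w, h)}")
```

With these numbers, the first two components get the same niter because they share the same occupancy and the same short side, while raising either the occupancy or the short side increases niter.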
@YoungminBaek
Thanks a lot for your answer, I get the idea behind this post-processing now! But I still can't understand what sx, sy, ex, and ey are and why they are necessary. Also, in kernel = cv2.getStructuringElement(cv2.MORPH_RECT,(1 + niter, 1 + niter)), niter decides the size of the kernel. Why is that?
Thank you!
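Reading the snippet in the question literally, sx, sy, ex, and ey are the component's bounding box grown by niter pixels on each side and then clamped to the image, so cv2.dilate runs only on that small window of segmap rather than on the whole map. A rough sketch of that step follows; the helper name padded_roi and the sample numbers are assumptions made for illustration.

```python
def padded_roi(x, y, w, h, niter, img_w, img_h):
    """Grow the box (x, y, w, h) by niter pixels and clamp it to the image.

    Mirrors the sx/ex/sy/ey lines and the boundary check in the quoted
    snippet. Returns (sx, ex, sy, ey), usable as segmap[sy:ey, sx:ex].
    """
    sx, ex = x - niter, x + w + niter + 1
    sy, ey = y - niter, y + h + niter + 1
    # boundary check: keep the window inside the image
    sx = max(sx, 0)
    sy = max(sy, 0)
    if ex >= img_w:
        ex = img_w
    if ey >= img_h:
        ey = img_h
    return sx, ex, sy, ey


# Hypothetical component near the top-left corner of a 28x100 map
print(padded_roi(x=5, y=3, w=20, h=10, niter=6, img_w=28, img_h=100))
```

On the kernel size: dilating with a rectangular kernel of (1 + niter) x (1 + niter) pixels grows a region by niter pixels in total along each axis, so tying the kernel to niter is what turns the formula into an actual amount of growth, and a window padded by niter on every side leaves room for the dilated result.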