CRAFT-Reimplementation
Questions on scaling images and GT masks
- In data_loader.py, in the pull_item function:
region_scores = self.resizeGt(region_scores)
affinity_scores = self.resizeGt(affinity_scores)
confidence_mask = self.resizeGt(confidence_mask)
and the function definition of resizeGt is:
def resizeGt(self, gtmask):
    return cv2.resize(gtmask, (self.target_size // 2, self.target_size // 2))
Why do you resize the GT maps to half the target size?
- In the same function, you perform element-wise division on region_scores and affinity_scores:
region_scores_torch = torch.from_numpy(region_scores / 255).float()
affinity_scores_torch = torch.from_numpy(affinity_scores / 255).float()
Why divide by 255?
- random_scale uses self.target_size as the minimum dimension size and 1280 as the maximum. This means the image and char boxes can end up anywhere between self.target_size and 1280. So what happens if the image is larger than 768? How do you guarantee that it will be 768? You don't seem to rescale the image after random_scale.
@ThisIsIsaac
Q1: The output map downsamples the input to 1/2 size.
Q2: The region and affinity scores are between 0 and 1.
Q3: The image is randomly cropped to 768*768.
I think you can read the author's paper to get more details.
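
To make the three answers concrete, here is a rough sketch of how the sizes and value ranges would flow through the pipeline. It is not the repo's code: random_crop and halve are hypothetical helpers, the input shape is made up, NumPy slicing stands in for cv2.resize so the sketch has no OpenCV dependency, and it assumes the crop is applied to the image and the GT maps together and that the GT maps are stored as 0..255 values.

```python
import numpy as np

TARGET_SIZE = 768  # assumed value of self.target_size, taken from the question


def random_crop(image, gt_maps, size, rng):
    """Cut the same random size x size window out of the image and every GT map."""
    h, w = image.shape[:2]
    top = int(rng.integers(0, h - size + 1))
    left = int(rng.integers(0, w - size + 1))
    window = (slice(top, top + size), slice(left, left + size))
    return image[window], [m[window] for m in gt_maps]


def halve(gt_map):
    """Stand-in for cv2.resize(gt_map, (size // 2, size // 2)): keep every 2nd pixel."""
    return gt_map[::2, ::2]


rng = np.random.default_rng(0)

# Hypothetical result of random_scale: each side lies somewhere in [768, 1280].
image = rng.integers(0, 256, size=(1000, 1280, 3), dtype=np.uint8)
region = rng.integers(0, 256, size=(1000, 1280), dtype=np.uint8)
affinity = rng.integers(0, 256, size=(1000, 1280), dtype=np.uint8)

# Q3: the crop, not a rescale, is what brings the image to 768 x 768.
image, (region, affinity) = random_crop(image, [region, affinity], TARGET_SIZE, rng)

# Q1: the GT maps are halved to match a network output at 1/2 input resolution.
# Q2: dividing by 255 maps the stored 0..255 scores into [0, 1].
region = halve(region) / 255.0
affinity = halve(affinity) / 255.0

print(image.shape)   # (768, 768, 3)
print(region.shape)  # (384, 384)
```

Since random_scale never produces a side shorter than self.target_size, a 768 x 768 window always fits, which is why no extra rescale is needed after it.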