
mask offset

Open avtregubov opened this issue 3 years ago • 7 comments

Hello! I tried to use yolov7 following this instruction: www.youtube.com/watch?v=qej73NGDQfo After training (total loss ~0.15) I tried to detect a balloon in an image. In the result image the mask has an offset, and the amount of displacement depends on the image size. How can I fix it? [image: mask offset] Thank you!

avtregubov avatar Sep 29 '22 18:09 avtregubov

Can you comment on the resolution of the original image, the expected model input resolution, the model output resolution, and the retrieved mask resolution? It looks like the mask's resolution is not scaled back to the input's resolution - there is not only an offset (vertical and horizontal), but the balloon's shape is also smaller.

brmarkus avatar Sep 30 '22 04:09 brmarkus

The images in the dataset have different resolutions (from 1895×2048 to 1024×683). I use the config from https://github.com/jinfagang/yolov7_d2/blob/main/configs/coco-instance/yolomask.yaml and changed only the YOLO.CLASSES, DATASETS.TRAIN and DATASETS.TEST settings.

mask - torch.Size([1, 600, 388]) for an input image of 662×1024
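A quick way to see that something is off here is to compare aspect ratios: if the mask had simply been scaled uniformly from the image, the two ratios would match. A minimal check, assuming the reported 662×1024 is (height, width):

```python
# Compare mask vs. image aspect ratios; a mismatch means the mask
# was not produced by a uniform rescale of the input image.
mask_h, mask_w = 600, 388
img_h, img_w = 662, 1024  # assumption: (height, width) ordering

mask_ratio = round(mask_h / mask_w, 3)
img_ratio = round(img_h / img_w, 3)
print(mask_ratio)  # 1.546
print(img_ratio)   # 0.646
```

Since 1.546 ≠ 0.646 (and neither matches the transposed ratio), the mask shape does not correspond to any uniform resize of the image, which is consistent with the offset and shrinkage visible in the screenshot.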

avtregubov avatar Sep 30 '22 05:09 avtregubov

Hi, obviously the mask is not right. Two things to check (you can send me a PR with a fix):

  1. the mask output shape should correspond to your model input shape (not the image shape);
  2. make sure the mask is resized back to the original image shape (not the model input shape).
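Point 2 above can be sketched with `torch.nn.functional.interpolate`; this is a minimal illustration, not the repo's actual post-processing code, and it assumes the mask comes out as a (C, H, W) float-compatible tensor at model-input resolution:

```python
import torch
import torch.nn.functional as F

def rescale_mask(mask: torch.Tensor, orig_h: int, orig_w: int) -> torch.Tensor:
    """Resize a predicted mask (C, H, W) from model-input resolution
    back to the original image resolution."""
    # interpolate expects a 4-D (N, C, H, W) tensor, so add a batch dim
    resized = F.interpolate(
        mask.unsqueeze(0).float(),
        size=(orig_h, orig_w),
        mode="bilinear",
        align_corners=False,
    )
    return resized.squeeze(0)

# e.g. the (1, 600, 388) mask from the report, scaled to a 662x1024 image
mask = torch.zeros(1, 600, 388)
out = rescale_mask(mask, 662, 1024)
print(out.shape)  # torch.Size([1, 662, 1024])
```

For a binary mask you would typically threshold the interpolated values afterwards (e.g. `out > 0.5`), since bilinear resizing produces fractional values at the edges.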

lucasjinreal avatar Oct 01 '22 07:10 lucasjinreal

Hello,

I'm facing the same issue on my custom dataset: the masks are offset and have a different scale (like in the picture above).

Where exactly can I find the mask output shape or the model input shape?

Thank you!

janikstfub avatar Dec 07 '22 11:12 janikstfub

You might want to use a tool like "Netron" to visualize the model's architecture, and then inspect the (possibly multiple) inputs and outputs.

brmarkus avatar Dec 07 '22 11:12 brmarkus

I fixed the mask offset for my custom dataset by changing "https://github.com/jinfagang/yolov7_d2/blob/main/demo.py#L50" to `image = original_image` instead of `image = self.aug.get_transform(original_image).apply_image(original_image)`. But I'm not sure whether this fixes the problem in general or in every case...

janikstfub avatar Dec 08 '22 15:12 janikstfub

@janikstfub that augmentation resizes the original image along its shortest edge. If you skip this step, your raw images are used as input; that should work well, provided your image size is divisible by 32.
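The divisible-by-32 constraint mentioned above can be satisfied by zero-padding the raw image on the bottom and right before inference. This is a hedged sketch (the `pad_to_multiple` helper is hypothetical, not part of yolov7_d2), assuming a (C, H, W) image tensor:

```python
import torch
import torch.nn.functional as F

def pad_to_multiple(img: torch.Tensor, divisor: int = 32) -> torch.Tensor:
    """Zero-pad an image tensor (C, H, W) on the bottom/right so that
    height and width become divisible by `divisor`."""
    _, h, w = img.shape
    pad_h = (divisor - h % divisor) % divisor
    pad_w = (divisor - w % divisor) % divisor
    # F.pad padding order for the last two dims is (left, right, top, bottom)
    return F.pad(img, (0, pad_w, 0, pad_h))

# e.g. a 662x1024 image: 662 is padded up to 672, 1024 is already a multiple
img = torch.zeros(3, 662, 1024)
padded = pad_to_multiple(img)
print(padded.shape)  # torch.Size([3, 672, 1024])
```

Padding (rather than resizing) keeps the original pixel coordinates intact, so any masks predicted on the padded image only need to be cropped back to the original height and width, with no rescaling and therefore no offset.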

lucasjinreal avatar Dec 09 '22 02:12 lucasjinreal