Pretrained-Pix2Seq About LargeScaleJitter

About LargeScaleJitter

Open ShuaiBai623 opened this issue 3 years ago • 3 comments

Hi, great work! We are also trying to reimplement Pix2Seq, and we find that absolute coordinates are useful. This is similar to your LargeScaleJitter (pad or crop the image to the fixed desired size). By absolute coordinates we mean normalizing the positions by dividing by the fixed size: boxes = boxes / 1333. instead of boxes = boxes / torch.tensor([w, h, w, h], dtype=torch.float32). Then padding or cropping the image to the fixed desired size is not necessary.
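To make the two normalizations concrete, here is a minimal sketch in plain Python. The helper names, the example box, and the example image size are assumptions for illustration only; the constant 1333 is the fixed size mentioned above, and the four-value box layout is assumed to be (x0, y0, x1, y1).

```python
FIXED_SIZE = 1333.0  # fixed normalization size mentioned in this thread


def normalize_relative(box, w, h):
    """Relative: divide each coordinate by the image's own width or height."""
    x0, y0, x1, y1 = box
    return (x0 / w, y0 / h, x1 / w, y1 / h)


def normalize_absolute(box, fixed_size=FIXED_SIZE):
    """Absolute: divide every coordinate by one fixed size for all images."""
    return tuple(c / fixed_size for c in box)


# Hypothetical example values (not taken from the repository).
box = (100.0, 200.0, 400.0, 600.0)  # pixel coordinates
w, h = 800, 1200                    # image width and height

rel_box = normalize_relative(box, w, h)
abs_box = normalize_absolute(box)

print("relative:", rel_box)
print("absolute:", abs_box)
```

With relative normalization the same pixel offset maps to different values depending on the image size, while with absolute normalization it always maps to the same value, as long as no image side exceeds the fixed size.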

Nov 08 '21 03:11 ShuaiBai623

Thanks. We also use absolute coordinates as described in Pix2Seq. https://github.com/gaopengcuhk/Pretrained-Pix2Seq/blob/7d908d499212bfabd33aeaa838778a6bfb7b84cc/playground/pix2seq/pix2seq.py#L88-L90 The conversion to relative coordinates in transforms.py is there because we use the same dataloader as DETR. As for large scale jittering, we basically follow the same pipeline proposed in CopyPaste, which is cited in Pix2Seq.
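For readers unfamiliar with large scale jittering, the geometry of the "random resize, then pad or crop to a fixed size" idea can be sketched roughly as below. This is not the code in this repository: the function name, the scale range of 0.1 to 2.0, and the choice to scale relative to the longer side are all assumptions made for illustration, and only sizes are computed (no image is touched).

```python
import random


def large_scale_jitter_geometry(w, h, target=1333, scale_range=(0.1, 2.0), rng=random):
    """Hypothetical helper: sizes after a random resize plus pad-or-crop.

    Returns (new_w, new_h, crop_w, crop_h, pad_w, pad_h).
    """
    # Sample a random scale factor (assumed range).
    scale = rng.uniform(*scale_range)
    # Resize so the longer side becomes scale * target, keeping aspect ratio.
    ratio = scale * target / max(w, h)
    new_w = max(1, round(w * ratio))
    new_h = max(1, round(h * ratio))
    # Crop whatever exceeds the target, pad whatever falls short.
    crop_w, crop_h = min(new_w, target), min(new_h, target)
    pad_w, pad_h = target - crop_w, target - crop_h
    return new_w, new_h, crop_w, crop_h, pad_w, pad_h


rng = random.Random(0)  # seeded so the example is repeatable
geometry = large_scale_jitter_geometry(800, 1200, rng=rng)
print(geometry)
```

Whatever scale is sampled, the cropped size plus the padding always adds up to the fixed target, so every output has the same shape.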

Nov 08 '21 06:11 hanqiu-hq

Hello, does using absolute coordinates give a better AP?

Nov 09 '21 01:11 baiyongrui

Is the only difference between relative and absolute coordinates the normalization factor? Absolute coordinates normalize by the longest image size instead of the actual image size, which is what relative coordinates use. Am I right?

Jan 25 '22 10:01 seanzhuh
