pix2seq

About sequence formulation for instance segmentation

Open volgachen opened this issue 3 years ago • 3 comments

Excuse me, I am interested in Pix2Seq and am trying to understand it better. I wonder how instance segmentation targets are formulated for training. To be more specific:

  • How to convert COCO annotations into target sequences? (especially those with more than one polygon)
  • Do the starting point and direction matter?
  • Is there any design handling the varying length of target sequences?

It would be nice if you could provide these details at your convenience. Thank you!

volgachen avatar Jul 28 '22 09:07 volgachen

How to convert COCO annotations into target sequences? (especially those with more than one polygon)

We use a separator to indicate different polygons.
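As a rough illustration (not the repository's actual code), the polygons of one instance could be joined into a single sequence along these lines. The marker value `SEP` and the function name `flatten_polygons` are assumptions made for this sketch; the only given is that COCO stores each polygon as a flat `[x1, y1, x2, y2, ...]` list.

```python
# Hypothetical separator marker; the actual token used by Pix2Seq
# is not stated in this thread.
SEP = -1

def flatten_polygons(polygons, sep=SEP):
    """Join COCO-style polygons ([x1, y1, x2, y2, ...]) into one sequence,
    placing `sep` between consecutive polygons."""
    seq = []
    for i, poly in enumerate(polygons):
        if i > 0:
            seq.append(sep)  # mark the boundary before every polygon but the first
        seq.extend(poly)
    return seq
```

An instance with a single polygon comes out unchanged, so the separator only appears when an annotation really has several parts.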

Do the starting point and direction matter?

We randomly pick the starting point and follow the same direction as in the annotation.
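One way to picture this is as a cyclic rotation of the vertex list, sketched below under the assumption that a polygon is a flat `[x1, y1, x2, y2, ...]` list as in COCO. The function name `rotate_polygon` and its `rng` parameter are hypothetical and not taken from the Pix2Seq code.

```python
import random

def rotate_polygon(poly, rng=random):
    """Pick a random starting vertex of a flat [x1, y1, x2, y2, ...] polygon
    and keep the original traversal direction."""
    n = len(poly) // 2            # number of vertices
    if n == 0:
        return list(poly)
    start = rng.randrange(n)      # index of the new first vertex
    # Rotate by whole (x, y) pairs so coordinates stay paired and ordered.
    return list(poly[2 * start:]) + list(poly[:2 * start])
```

Because only the starting vertex changes, the vertex order (clockwise or counter-clockwise) is whatever the annotation already used.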

Is there any design handling the varying length of target sequences?

We use an ending token to indicate the end of the sequence. We simply truncate the sequence if it turns out to be longer than the predefined maximum sequence length (which is rare and happens only when the annotation is very fine-grained).
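A minimal sketch of that last step, with made-up values: `EOS`, `MAX_SEQ_LEN`, and the function name `finalize` are assumptions for illustration. Whether a truncated sequence still ends with the ending token is not stated in this thread; the sketch simply cuts it off.

```python
EOS = -2          # hypothetical end-of-sequence marker
MAX_SEQ_LEN = 8   # hypothetical limit, kept tiny for the example

def finalize(seq, max_len=MAX_SEQ_LEN, eos=EOS):
    """Append the ending token, then truncate to at most `max_len` items.
    Assumption: an over-long sequence loses its ending token when cut."""
    return (list(seq) + [eos])[:max_len]
```

With these values, a short sequence gains the ending token, while one that already holds eight or more items is cut to its first eight.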

Hope this helps.

chentingpc avatar Aug 03 '22 01:08 chentingpc

Thank you for the detailed response. I get it now! It's really surprising that such a simple solution achieves such good results.

volgachen avatar Aug 05 '22 02:08 volgachen

Could you explain how to do this in more detail?

gg22mm avatar Nov 14 '23 08:11 gg22mm