About sequence formulation for instance segmentation
Excuse me, I am interested in Pix2Seq and am trying to understand it better. I wonder how instance segmentation targets are formulated for training. To be more specific,
- How do you convert COCO annotations into target sequences? (especially those with more than one polygon)
- Do the starting point and direction matter?
- Is there any design handling the varying length of target sequences?
It would be nice if you could provide these details at your convenience. Thank you!
How to convert COCO annotations into target sequences? (especially those with more than one polygon)
We use a separator token to mark the boundary between different polygons.
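For readers following along, here is a minimal sketch of the separator idea. It is not the Pix2Seq code: the `SEP` value and the `join_polygons` helper are hypothetical names, and the input is assumed to be COCO-style polygons, i.e. a list of flat `[x1, y1, x2, y2, ...]` lists.

```python
# Hypothetical separator token; the actual token used by the authors
# is not specified in this thread.
SEP = "<sep>"

def join_polygons(polygons):
    """Flatten several COCO-style polygons into one list.

    Each polygon is a flat list [x1, y1, x2, y2, ...]. A SEP token is
    placed between consecutive polygons (not before the first one and
    not after the last one).
    """
    seq = []
    for i, poly in enumerate(polygons):
        if i > 0:
            seq.append(SEP)
        seq.extend(poly)
    return seq

# An instance made of two triangles.
two_parts = [[1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12]]
joined = join_polygons(two_parts)
```

In a real pipeline the coordinates would presumably be converted to discrete tokens first; that step is left out here because the thread does not describe it.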
Do the starting point and direction matter?
We randomly pick the starting point and follow the same direction as in the annotation.
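As a rough illustration of that answer (a sketch, not the authors' implementation), picking a random starting vertex while keeping the annotation's direction amounts to a cyclic rotation of the vertex list. The `rotate_polygon` name and the `rng` parameter are assumptions made for this example.

```python
import random

def rotate_polygon(poly, rng=random):
    """Start a polygon at a randomly chosen vertex, keeping its direction.

    poly is a flat COCO-style list [x1, y1, x2, y2, ...]. The vertex
    order is preserved; only the starting vertex changes.
    """
    num_points = len(poly) // 2
    if num_points == 0:
        return list(poly)
    start = rng.randrange(num_points)  # index of the new first vertex
    # Rotate by whole (x, y) pairs so coordinates stay paired.
    return poly[2 * start:] + poly[:2 * start]

square = [0, 0, 1, 0, 1, 1, 0, 1]
rotated = rotate_polygon(square, random.Random(0))
```

Passing a seeded `random.Random` instance makes the choice reproducible, which is convenient for debugging a data pipeline.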
Is there any design handling the varying length of target sequences?
We use an ending token to indicate the end of the sequence. We simply truncate the sequence if it turns out to be longer than the predefined maximum sequence length (which is rare and only happens when the annotation is very fine-grained).
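A minimal sketch of the ending token and truncation described above, under stated assumptions: `EOS`, `MAX_SEQ_LEN`, and `finalize` are hypothetical names, and the sketch assumes that a truncated sequence simply loses its ending token, which the thread does not confirm.

```python
EOS = "<eos>"      # hypothetical ending token
MAX_SEQ_LEN = 8    # hypothetical limit, kept tiny for illustration

def finalize(tokens, max_len=MAX_SEQ_LEN):
    """Append the ending token, then cut the sequence to max_len.

    Assumption: when the sequence is too long, it is cut off and the
    ending token is dropped along with the overflow.
    """
    seq = list(tokens) + [EOS]
    return seq[:max_len]

short = finalize([1, 2, 3])         # fits, so it ends with EOS
long = finalize(list(range(20)))    # too long, so it is cut to 8 items
```

Sequences shorter than the limit would normally also be padded to a fixed length for batching; padding is omitted because it is not discussed here.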
Hope this helps.
Thank you for the detailed response. I get it now! It's really surprising that such a simple solution achieves such good results.
Could you explain how to do this in more detail?