FoundationPose: getting the first-frame mask without XMem or SAM preprocessing

This is a follow-up question.

Is there a way to make FoundationPose work with only the 2D bounding box of the object of interest?

Has anyone streamlined it so that the first-frame mask no longer has to be provided as a prerequisite?

Also, I am not sure how to provide the prerequisite mask by clicking a single point on the object. Could someone please walk me through it?

Apr 11 '24 17:04 monajalal

Is there a way to make FoundationPose work with only the 2D bounding box of the object of interest?

Yes, you can convert the bbox to a segmentation mask and run it the same way. It will work fine. To convert, set the pixels inside the box to a value > 0 and the background to 0.

Apr 11 '24 18:04 wenbowen123

Thanks for your response. Could you please clarify this or link me to a reference? Could you provide an example?

Yes, you can convert the bbox to a segmentation mask and run it the same way. It will work fine. To convert, set the pixels inside the box to a value > 0 and the background to 0.

Apr 12 '24 14:04 monajalal

Do you expect the performance to drop if I use a 2D bbox instead of a segmentation mask?

Apr 12 '24 14:04 monajalal

Suppose your bbox is [umin, vmin, umax, vmax]

import numpy as np
# height, width: image size in pixels; rows are indexed by v, columns by u
mask = np.zeros((height, width), dtype=bool)
mask[vmin:vmax, umin:umax] = True
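
For reference, a slightly more defensive version of the snippet above might look like the sketch below. The function name bbox_to_mask is hypothetical (it is not part of FoundationPose), and the sketch assumes the box is given as pixel corners [umin, vmin, umax, vmax] with exclusive upper bounds. A detector that outputs another layout, such as center plus width and height, would need converting first.

```python
import numpy as np

def bbox_to_mask(bbox, height, width):
    """Return a boolean (height, width) mask that is True inside bbox.

    Assumption: bbox is [umin, vmin, umax, vmax] in pixel coordinates,
    with u along columns, v along rows, and umax/vmax exclusive.
    """
    umin, vmin, umax, vmax = (int(round(c)) for c in bbox)
    # Clip to the image so negative or out-of-range coordinates do not wrap around.
    umin, umax = np.clip([umin, umax], 0, width)
    vmin, vmax = np.clip([vmin, vmax], 0, height)
    if umax <= umin or vmax <= vmin:
        raise ValueError("bbox is empty after clipping to the image")
    mask = np.zeros((height, width), dtype=bool)
    mask[vmin:vmax, umin:umax] = True
    return mask

# Example with made-up numbers: a 200x200 box in a 640x480 image.
mask = bbox_to_mask([100, 50, 300, 250], height=480, width=640)
print(mask.shape, mask.sum())
```

The resulting array can then be passed wherever the first-frame segmentation mask would normally go.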

Apr 12 '24 17:04 wenbowen123

No, it should work as well as the segmentation mask. I've tried this many times.

Apr 12 '24 17:04 wenbowen123

@wenbowen123 Thanks a lot for your guidance. I just wanted to confirm that I was able to run FoundationPose using only the first frame's 2D bbox in YOLOX format, converted to a binary mask.

Apr 15 '24 18:04 monajalal

yes

Apr 16 '24 04:04 wenbowen123

@wenbowen123 In this case, the generated mask will be completely white. Will it still work?

Apr 25 '24 15:04 abhishekmonogram

@wenbowen123 In this case, the generated mask will be completely white. Will it still work?

The area inside the 2D box will be all white. Yes, this will be fine.
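
To make "all white inside the box" concrete, here is a small sketch with made-up image and box sizes. It only illustrates that a box mask stored as an 8-bit image (255 inside, 0 outside) thresholds back to the same boolean mask, and that only the box region counts as foreground, not the whole image.

```python
import numpy as np

# Hypothetical 480x640 frame with a box covering rows 50..249 and columns 100..299.
box_mask = np.zeros((480, 640), dtype=bool)
box_mask[50:250, 100:300] = True

# As an 8-bit image: 255 (white) inside the box, 0 (black) elsewhere.
mask_image = box_mask.astype(np.uint8) * 255

# Thresholding at > 0 recovers the same boolean mask.
recovered = mask_image > 0
assert np.array_equal(recovered, box_mask)

# Only the box region is foreground, not the whole image.
print(f"foreground fraction: {box_mask.mean():.3f}")
```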

Apr 25 '24 17:04 wenbowen123