ImageBind with SAM Simple Demo: Segment with Different Modalities

Open rentainhe opened this issue 1 year ago • 4 comments

Thanks a lot for releasing such amazing work!

We implemented a simple and interesting demo by combining ImageBind with SAM here: ImageBind-SAM, which can segment things with different modalities. The project is still under development.

The basic idea follows IEA: Image Editing Anything and CLIP-SAM, which generate the referring mask with the following steps:

Step 1: Generate auto masks with SamAutomaticMaskGenerator
Step 2: Crop the box region of each mask from the image
Step 3: Compute the similarity between the cropped images and the input from a different modality
Step 4: Merge the mask regions with the highest similarity
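
As a rough illustration of Steps 2 and 4, here is a minimal sketch that assumes the masks come in the list-of-dicts layout returned by SamAutomaticMaskGenerator.generate (a "segmentation" grid and an XYWH "bbox" per mask). Nested Python lists stand in for image arrays, and the similarity scores are made-up placeholders for what Step 3 would produce. The helper names box_mask, crop_box, and merge_masks are hypothetical and not taken from ImageBind-SAM.

```python
def box_mask(bbox, height, width):
    """Build a toy boolean segmentation that is True inside an XYWH box."""
    x, y, w, h = bbox
    return [[x <= c < x + w and y <= r < y + h for c in range(width)]
            for r in range(height)]


def crop_box(image, bbox):
    """Step 2: crop a row-major image (list of rows) to an XYWH box."""
    x, y, w, h = bbox
    return [row[x:x + w] for row in image[y:y + h]]


def merge_masks(masks, scores, threshold):
    """Step 4: OR together the segmentations whose score exceeds the threshold."""
    height = len(masks[0]["segmentation"])
    width = len(masks[0]["segmentation"][0])
    merged = [[False] * width for _ in range(height)]
    for mask, score in zip(masks, scores):
        if score <= threshold:
            continue  # this region is not similar enough to the query
        for r, row in enumerate(mask["segmentation"]):
            for c, on in enumerate(row):
                merged[r][c] = merged[r][c] or on
    return merged


# Toy 4x4 "image" whose pixel values are just their flat index.
image = [[r * 4 + c for c in range(4)] for r in range(4)]

# Two toy masks in the same dict layout that SAM's generator produces.
masks = [
    {"bbox": [0, 0, 2, 2], "segmentation": box_mask([0, 0, 2, 2], 4, 4)},
    {"bbox": [2, 2, 2, 2], "segmentation": box_mask([2, 2, 2, 2], 4, 4)},
]

crops = [crop_box(image, m["bbox"]) for m in masks]

# Placeholder scores; in the real demo these would come from Step 3.
scores = [0.9, 0.1]
merged = merge_masks(masks, scores, threshold=0.5)
```

With these placeholder scores only the first mask survives the threshold, so the merged mask equals the top-left box.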

The results are shown below:

Input Model	Modality	Generated Mask
	car audio
	"A car"

The similarity threshold applied to each box has a large influence on the final result, so we will run more tests on it!
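
To make the threshold sensitivity concrete, here is a small sketch that scores three toy crop embeddings against one query embedding and selects crops at several thresholds. It assumes cosine similarity followed by a softmax over the crops; that scoring rule, the temperature value, the 2-D embeddings, and the function names are illustrative assumptions, not the demo's confirmed implementation.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def softmax(values, temperature=1.0):
    """Numerically stable softmax; a lower temperature sharpens the result."""
    scaled = [v / temperature for v in values]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def select(scores, threshold):
    """Indices of the crops whose score is above the threshold."""
    return [i for i, s in enumerate(scores) if s > threshold]


# Hypothetical embeddings: one query (e.g. audio or text) and three crops.
query = [1.0, 0.0]
crop_embeddings = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]]

similarities = [cosine(e, query) for e in crop_embeddings]
scores = softmax(similarities, temperature=0.1)  # assumed temperature

for threshold in (0.05, 0.5, 0.9):
    print(threshold, select(scores, threshold))
```

In this toy setup a threshold of 0.05 keeps two crops, 0.5 keeps one, and 0.9 keeps none, so the merged mask can change from two regions to nothing depending on a single parameter.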

May 16 '23 15:05 rentainhe
