ImageBind icon indicating copy to clipboard operation
ImageBind copied to clipboard

ImageBind with SAM Simple Demo: Segment with Different Modalities

Open rentainhe opened this issue 1 year ago • 4 comments

Thanks a lot for release such an amazing work!

We implement a simple and interesting demo by combing ImageBind with SAM here: ImageBind-SAM which can segment things with different modalities, and the project is still under develop

This basic idea is followed with IEA: Image Editing Anything and CLIP-SAM which generate the referring mask with the following steps:

  • Step 1: Generate auto masks with SamAutomaticMaskGenerator
  • Step 2: Crop all the box region from the masks
  • Step 3: Compute the similarity with cropped images and different modalities
  • Step 4: Merge the highest similarity mask region

And the result is shown as:

Input Model Modality Generate Mask
car audio
"A car"

And the threshold for each box will influence a lot on the final result, we will do more test on it!

rentainhe avatar May 16 '23 15:05 rentainhe