
SOTA Model for Text Prompt Segmentation

Open xiaobanni opened this issue 1 year ago • 7 comments

I am looking for a state-of-the-art (SOTA) model for text prompt segmentation. Currently, I am aware of two choices: Grounded-Segment-Anything and SEEM. However, both of these models fail to meet my requirements.

Consider the following example: I want the model to segment the lane lines on the road, but the aforementioned methods produce the following results:

Grounded-Segment-Anything: image

SEEM Model: image

Unfortunately, neither of them can solve this problem effectively. I would greatly appreciate any recommendations you may have.

Any information regarding the timeline for the release of SAM text-prompt capabilities would be welcome.

xiaobanni avatar Sep 26 '23 08:09 xiaobanni

I recommend this: https://github.com/luca-medeiros/lang-segment-anything

emi-dm avatar Sep 26 '23 09:09 emi-dm

Thank you for the recommendation. However, I have tried it and found that it is essentially an easier-to-read version of Grounded-Segment-Anything. It uses the same method: GroundingDINO translates the text prompt into a box prompt, which is then sent to SAM, so the results are similar to those of Grounded-Segment-Anything shown earlier. I believe a model that handles text prompts natively (rather than through two-stage invocation) is necessary to address this issue and to support broader downstream applications.
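For anyone new to the thread, here is a rough sketch of the two-stage flow described above (text to boxes, then boxes to masks). It is not the actual code of Grounded-Segment-Anything or lang-segment-anything: `two_stage_segment`, `toy_detector`, and `toy_box_segmenter` are hypothetical names, and the toy functions are trivial stand-ins for GroundingDINO and SAM that work on a nested-list image.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]   # (x0, y0, x1, y1), upper bounds exclusive
Image = List[List[int]]           # toy grayscale image as nested lists
Mask = List[List[bool]]


def two_stage_segment(
    image: Image,
    text: str,
    detect_boxes: Callable[[Image, str], List[Box]],
    segment_in_box: Callable[[Image, Box], Mask],
) -> Mask:
    """Stage 1 maps text to boxes; stage 2 maps each box to a mask.

    The returned mask is the union of the per-box masks, so it can only
    cover regions that stage 1 proposed.
    """
    height, width = len(image), len(image[0])
    combined = [[False] * width for _ in range(height)]
    for box in detect_boxes(image, text):
        mask = segment_in_box(image, box)
        for y in range(height):
            for x in range(width):
                combined[y][x] = combined[y][x] or mask[y][x]
    return combined


def toy_detector(image: Image, text: str) -> List[Box]:
    """Hypothetical stand-in for a text-grounded detector.

    It only "knows" the phrase "lane line" and returns one box around
    all non-zero pixels; any other phrase yields no boxes.
    """
    if text != "lane line":
        return []
    coords = [(x, y) for y, row in enumerate(image)
              for x, value in enumerate(row) if value > 0]
    if not coords:
        return []
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return [(min(xs), min(ys), max(xs) + 1, max(ys) + 1)]


def toy_box_segmenter(image: Image, box: Box) -> Mask:
    """Hypothetical stand-in for a box-prompted segmenter.

    It marks the non-zero pixels that fall inside the box.
    """
    x0, y0, x1, y1 = box
    return [[x0 <= x < x1 and y0 <= y < y1 and value > 0
             for x, value in enumerate(row)]
            for y, row in enumerate(image)]
```

Calling `two_stage_segment(image, "lane line", toy_detector, toy_box_segmenter)` returns a mask only where the detector proposed a box. If stage 1 returns no box or a poor box for the phrase, stage 2 has nothing useful to work with, which is the structural limitation being discussed here.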

xiaobanni avatar Sep 26 '23 15:09 xiaobanni

Do you have any good solutions? I'm facing the same problem now.

TerryYiDa avatar Oct 07 '23 09:10 TerryYiDa

@TerryYiDa No. I hope this issue can be used to track progress on advanced text-prompt segmentation models.

xiaobanni avatar Oct 07 '23 10:10 xiaobanni

I have the same problem. Did you find a solution?

iacopo97 avatar Nov 17 '23 16:11 iacopo97

Lol, I really wish the ability to use text prompts were opened up. A two-stage approach like Grounded-Segment-Anything is neither useful nor elegant. 😣

YuetianW avatar Dec 13 '23 08:12 YuetianW

Has anyone made progress with this issue?

muhammadsr avatar Mar 30 '24 12:03 muhammadsr