
Guiding what to segment

Open hbardak opened this issue 10 months ago • 18 comments

Hi! First of all, amazing work! Not really an issue, more of a question.

From reading the paper, I am not sure whether you can choose what to segment rather than just the foreground (a bit like the Segment Anything Model). As I am just an artist, I don't understand everything. Could you confirm or deny this?

Best regards,

hbardak avatar Apr 23 '24 07:04 hbardak

Hi, thanks for your interest :) At the moment, the target to segment cannot be specified. What the model segments is learned from the dataset (e.g., salient object detection). However, I think it would be easy to make a small modification to the dataset to obtain boxes of targets, which could be used as a prompt so you can choose which object to segment. Do you have that need? If this would really be useful to people like you, I can spare some time to try it.

ZhengPeng7 avatar Apr 23 '24 07:04 ZhengPeng7

I think that would be useful!

hbardak avatar Apr 23 '24 07:04 hbardak

Alright, I'll spare some time to give it a try. Updates will be posted here (successful or not), so stay tuned.

ZhengPeng7 avatar Apr 23 '24 08:04 ZhengPeng7

Amazing! Thank you!

hbardak avatar Apr 23 '24 18:04 hbardak

Any update on using a bounding box as input, just like SAM?

laxmaniron avatar Jun 02 '24 21:06 laxmaniron

Not yet, but it may come out this week.

ZhengPeng7 avatar Jun 03 '24 04:06 ZhengPeng7

Amazing!

hbardak avatar Jun 03 '24 19:06 hbardak

Hi there :) I made a colab with box guidance for BiRefNet inference. You can try it now. For the moment, though, the box coordinates have to be typed manually into the box variable, which is not user-friendly. I'll make a GUI so the box can be drawn, and add support for multiple boxes.

ZhengPeng7 avatar Jul 17 '24 14:07 ZhengPeng7

Thanks, will check this out.

laxmaniron avatar Jul 17 '24 20:07 laxmaniron

Very nice! Did you train the model again? And how did you get the dataset for this? @ZhengPeng7

Could you also make some comparisons with SAM to see how the performance compares?

rishabh063 avatar Jul 18 '24 15:07 rishabh063

Oh, I saw the colab code: you are just passing in the cropped part. Nice hack!

rishabh063 avatar Jul 18 '24 15:07 rishabh063
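The crop-based trick can be sketched roughly like this. This is a minimal sketch, not the colab's actual code; `predict_mask` is a hypothetical placeholder standing in for real BiRefNet inference on the crop:

```python
import numpy as np

def predict_mask(crop):
    # Placeholder (hypothetical) for BiRefNet inference on the crop;
    # the real colab runs the model here. Returns a foreground mask
    # with the same height/width as the crop.
    return np.full(crop.shape[:2], 255, dtype=np.uint8)

def segment_with_box(image, box):
    # box is (x1, y1, x2, y2); (0, 0) is the top-left pixel.
    x1, y1, x2, y2 = box
    crop = image[y1:y2, x1:x2]             # pass only the boxed region
    crop_mask = predict_mask(crop)         # segment the crop alone
    full_mask = np.zeros(image.shape[:2], dtype=np.uint8)
    full_mask[y1:y2, x1:x2] = crop_mask    # paste the result back in place
    return full_mask                       # outside the box stays background

image = np.zeros((800, 1280, 3), dtype=np.uint8)  # dummy H x W x C image
mask = segment_with_box(image, (666, 250, 1100, 777))
```

Since everything outside the box is cropped away before inference, the model can only ever mark pixels inside the box as foreground, which is what makes the hack work without retraining.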

> Very nice! Did you train the model again? And how did you get the dataset for this? @ZhengPeng7
>
> Could you also make some comparisons with SAM to see how the performance compares?

Thanks for the suggestion! I'll make some comparisons between them.

Yeah, I originally wanted to train a new model with a prompt encoder like SAM's. But later, while discussing it with some others, this simple but useful modification came to mind, so I immediately put together this simple demo.

ZhengPeng7 avatar Jul 19 '24 07:07 ZhengPeng7

Thanks !

pred_pil = pred_and_show(box=[666, 250, 1100, 777])

Could you tell us how the coordinates of the box work? I mean, is the top-left pixel (0, 0)?

hbardak avatar Jul 27 '24 10:07 hbardak

SAM_Birefnet1.json

By the way, I have made a ComfyUI workflow that uses SAM to get the BBOX before passing it to BiRefNet.

hbardak avatar Jul 27 '24 10:07 hbardak

> Thanks!
>
> pred_pil = pred_and_show(box=[666, 250, 1100, 777])
>
> Could you tell us how the coordinates of the box work? I mean, is the top-left pixel (0, 0)?

It's (x1, y1, x2, y2), as I wrote in the colab before (see the attached screenshot).

ZhengPeng7 avatar Jul 27 '24 10:07 ZhengPeng7
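For reference, converting from the common (x, y, width, height) box format to the (x1, y1, x2, y2) format the colab's box variable expects is a one-liner. The helper name below is made up for illustration:

```python
def xywh_to_xyxy(x, y, w, h):
    # (0, 0) is the top-left pixel; x grows rightward, y grows downward.
    # Returns [x1, y1, x2, y2]: the top-left and bottom-right corners.
    return [x, y, x + w, y + h]

box = xywh_to_xyxy(666, 250, 434, 527)
print(box)  # → [666, 250, 1100, 777], the box used in the example above
```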

And thanks for that workflow file, I'll check it today :)

ZhengPeng7 avatar Jul 27 '24 10:07 ZhengPeng7

> I originally wanted to train a new model with a prompt encoder like SAM's.

Thanks for your nice work. Have you tried training BiRefNet with a prompt (box) encoder, and how did it go?

SuyueLiu avatar Aug 15 '24 04:08 SuyueLiu

Unfortunately, I'm short of GPUs right now. When I have time and enough GPUs in the coming days, I'll try it to see whether it brings additional improvement.

ZhengPeng7 avatar Aug 15 '24 06:08 ZhengPeng7