BiRefNet
Guiding what to segment
Hi! Firstly, amazing work! Not really an issue, more of a question.
From reading the paper, I am not sure whether you can choose what to segment rather than just the foreground (a bit like the Segment Anything Model). As I am just an artist, I don't understand everything. Could you confirm or deny it?
Best regards,
Hi, thanks for your interest :) At the moment, the target to segment cannot be specified. What the model segments is learned from the dataset (e.g., salient object detection). However, I think it would be easy to make a small modification to the dataset to get boxes of the targets, which could then be used as a prompt to choose which object you want to segment. Do you have that need? If this would really be useful to people like you in development, I can spare some time to try it.
I think that would be useful!
Alright, I'll spare some time to give it a try. Updates will be posted here (successful or not), so you can wait for them.
Amazing! Thank you!
Any update on this, i.e., using a bounding box as input, just like SAM?
Not yet, but it may come out this week.
Amazing!
Hi there :) I made a Colab with box guidance for BiRefNet inference. You can try it now.
For now, the box info is manually put into the variable box, which is not user-friendly.
I'll make a GUI to obtain the box info by drawing, and add support for multiple boxes.
Thanks, will check this out.
Very nice! Did you train the model again? And how were you able to get this dataset? @ZhengPeng7
Also, could you make some comparison with SAM to see how the performance holds up?
Oh, I saw the Colab code, you are just passing the cropped part, nice hack.
Thanks for the suggestion! I'll make some comparisons between them.
Yeah, I originally wanted to train a new model with a prompt encoder like what SAM does. But later, when I discussed it with some others, this simple but useful modification came to mind, so I immediately put together this simple demo.
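For anyone following along who wants to reproduce the idea, here is a minimal sketch of the crop-based box guidance described above. It is my own reconstruction, not the exact Colab code: the function name pred_with_box is made up, and it assumes the model is loaded from the Hugging Face checkpoint ZhengPeng7/BiRefNet.

```python
# Rough sketch of crop-based box guidance for BiRefNet (not the exact Colab code).
# Assumes the Hugging Face checkpoint 'ZhengPeng7/BiRefNet'; pred_with_box is a made-up name.
import torch
from PIL import Image
from torchvision import transforms
from transformers import AutoModelForImageSegmentation

model = AutoModelForImageSegmentation.from_pretrained(
    "ZhengPeng7/BiRefNet", trust_remote_code=True
).eval()

to_tensor = transforms.Compose([
    transforms.Resize((1024, 1024)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

def pred_with_box(image: Image.Image, box):
    """Crop the image to the (x1, y1, x2, y2) box, segment the crop,
    and paste the predicted mask back into a full-size canvas."""
    x1, y1, x2, y2 = box
    crop = image.crop((x1, y1, x2, y2))
    inputs = to_tensor(crop).unsqueeze(0)
    with torch.no_grad():
        # BiRefNet returns a list of side outputs; the last one is the final prediction.
        pred = model(inputs)[-1].sigmoid()[0, 0]
    mask_crop = transforms.functional.to_pil_image(pred).resize(crop.size)
    mask = Image.new("L", image.size, 0)
    mask.paste(mask_crop, (x1, y1))
    return mask

# Example usage with a box in (x1, y1, x2, y2) pixel coordinates:
# mask = pred_with_box(Image.open("example.jpg").convert("RGB"), box=[666, 250, 1100, 777])
```

The appeal of the trick is that the crop inside the box is segmented at the model's full input resolution, so the box both selects the target and gives it more pixels than it would get in a whole-image pass.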
Thanks!
pred_pil = pred_and_show(box=[666, 250, 1100, 777])
Could you tell us how the box coordinates work? I mean, is the top-left pixel (0, 0)?
By the way, I have made a ComfyUI workflow that uses SAM to get the bounding box before passing it to BiRefNet.
It's (x1, y1, x2, y2), as I wrote in the Colab.
And thanks for that workflow file, I'll check it today :)
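To make the convention explicit for other readers: since the guidance is crop-based, the box follows the standard image-coordinate convention used by PIL's Image.crop, i.e., the top-left pixel is (0, 0), x grows to the right and y grows downward, and (x1, y1, x2, y2) are the left, top, right, and bottom edges. A tiny check (with a hypothetical example.jpg) illustrates it:

```python
# (0, 0) is the top-left pixel; (x1, y1, x2, y2) = (left, top, right, bottom).
from PIL import Image

image = Image.open("example.jpg")   # hypothetical input image
box = (666, 250, 1100, 777)         # (x1, y1, x2, y2)
crop = image.crop(box)
print(crop.size)                    # (x2 - x1, y2 - y1) -> (434, 527)
```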
"I used to want to train a new model with a prompt encoder like what SAM does"
Thanks for your nice work. Have you tried to train BiRefNet with a prompt (box) encoder, and how did that go?
Unfortunately, I'm short of GPUs right now. When I have time and enough GPUs in the coming days, I'll still try that to see if it brings additional improvement.
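For readers curious what "a prompt encoder like what SAM does" could look like if it were ever bolted onto BiRefNet, here is a purely illustrative sketch in the spirit of SAM's box prompting. Nothing below exists in BiRefNet or has been trained; the class name BoxPromptEncoder and all design choices are my own assumptions.

```python
# Illustrative sketch of a SAM-style box prompt encoder; NOT part of BiRefNet,
# never trained or validated, all names and choices here are assumptions.
import math
import torch
import torch.nn as nn

class BoxPromptEncoder(nn.Module):
    """Embed an (x1, y1, x2, y2) box as two corner tokens, in the spirit of
    SAM's prompt encoder: random-Fourier positional encoding of the normalized
    corner coordinates plus a learned embedding per corner type."""

    def __init__(self, embed_dim: int = 256):
        super().__init__()
        # Random Gaussian matrix for Fourier-feature positional encoding.
        self.register_buffer("pe_matrix", torch.randn(2, embed_dim // 2))
        # Learned embeddings distinguishing the top-left and bottom-right corners.
        self.corner_embed = nn.Embedding(2, embed_dim)

    def forward(self, boxes: torch.Tensor, image_size: tuple) -> torch.Tensor:
        # boxes: (B, 4) in pixel coordinates; image_size: (H, W).
        h, w = image_size
        corners = boxes.view(-1, 2, 2) / boxes.new_tensor([w, h])   # normalize to [0, 1]
        proj = 2 * math.pi * corners @ self.pe_matrix               # (B, 2, embed_dim // 2)
        pe = torch.cat([proj.sin(), proj.cos()], dim=-1)            # (B, 2, embed_dim)
        return pe + self.corner_embed.weight                        # add corner-type embeddings

# Example: two corner tokens per box that a decoder could cross-attend to.
tokens = BoxPromptEncoder()(torch.tensor([[666., 250., 1100., 777.]]), (1024, 1024))
print(tokens.shape)  # torch.Size([1, 2, 256])
```

The two corner tokens would then have to be consumed somewhere in BiRefNet's decoder (e.g., via cross-attention), which is roughly how SAM uses box prompts; whether that actually beats the simple crop trick is exactly the open question discussed above.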