Support Prompting with Vague Segmentation Masks for Refinement
Background & Appreciation:
First, huge kudos 👏👏👏 for building BiRefNet—it produces exceptionally precise and smooth segmentation masks! The model’s ability to handle fine details while maintaining structural coherence is impressive and highly valuable for segmentation tasks.
Current Limitation of Box-Prompting:
While box-based prompting is a common approach, it intrinsically relies on cropping local regions of the image, discarding broader contextual information. This often leads to suboptimal segmentation in complex scenes (e.g., blurred edges, missing object parts, or inconsistent results when objects interact with their surroundings). SAM2, for example, shows greater flexibility with box prompts but still struggles with mask quality.
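To illustrate the point, here is a minimal sketch of what crop-based box prompting amounts to. It is not BiRefNet's actual code: the helper names `crop_to_box` and `paste_mask` are hypothetical, and an all-ones array stands in for the model's prediction on the crop. Everything outside the padded box never reaches the model.

```python
import numpy as np

def crop_to_box(image, box, pad=0):
    """Crop an H x W x C image to box = (x0, y0, x1, y1), expanded by `pad` pixels."""
    h, w = image.shape[:2]
    x0, y0, x1, y1 = box
    # Expand the box by the padding, then clip it to the image bounds.
    x0, y0 = max(0, x0 - pad), max(0, y0 - pad)
    x1, y1 = min(w, x1 + pad), min(h, y1 + pad)
    return image[y0:y1, x0:x1], (x0, y0, x1, y1)

def paste_mask(crop_mask, clipped_box, full_shape):
    """Place a mask predicted on the crop back onto a full-size canvas of zeros."""
    full = np.zeros(full_shape[:2], dtype=crop_mask.dtype)
    x0, y0, x1, y1 = clipped_box
    full[y0:y1, x0:x1] = crop_mask
    return full

image = np.random.rand(480, 640, 3).astype(np.float32)
crop, clipped = crop_to_box(image, (100, 50, 300, 250), pad=16)

# Placeholder for the segmentation model: it only ever sees `crop`.
crop_mask = np.ones(crop.shape[:2], dtype=np.float32)

full_mask = paste_mask(crop_mask, clipped, image.shape)
```

A larger `pad` recovers some context, but the model still cannot use anything beyond the crop.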
🚀 Enhancing BiRefNet with Mask-Based Prompting for Robust Refinement
To leverage the strengths of both SAM2’s generality and BiRefNet’s refinement capabilities, would it be possible to add support for coarse or fragmented masks as input prompts? Here’s the rationale:
SAM2 handles diverse prompts well but often outputs fragmented, low-confidence masks (e.g., scattered fragments, holes, or imprecise boundaries), especially for complex objects.
BiRefNet as a Refinement Specialist: By accepting these coarse SAM2 masks as prompts, BiRefNet could refine them into polished, high-fidelity masks, effectively "healing" fragmentation while preserving SAM2’s broad segmentation coverage.
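One way this could look on the input side, purely as a sketch: stack the coarse mask onto the RGB image as a fourth channel so the full image context is kept. This is an assumption about a possible design, not an existing BiRefNet feature. The function name `build_mask_prompted_input` is hypothetical, and a model would need to be adapted and trained to accept such a 4-channel input.

```python
import numpy as np

def build_mask_prompted_input(image, coarse_mask):
    """Stack an H x W x 3 image and an H x W coarse mask into an H x W x 4 array.

    Hypothetical input format: the fourth channel carries the mask prompt.
    """
    if image.shape[:2] != coarse_mask.shape:
        raise ValueError("image and coarse_mask must share height and width")
    # Keep the prompt as soft values in [0, 1] so low-confidence regions survive.
    mask = np.clip(coarse_mask.astype(np.float32), 0.0, 1.0)
    return np.concatenate([image.astype(np.float32), mask[..., None]], axis=-1)

image = np.random.rand(64, 64, 3).astype(np.float32)

# A toy "fragmented" mask: a square with a hole, standing in for a SAM2 output.
coarse = np.zeros((64, 64), dtype=np.float32)
coarse[10:30, 10:30] = 1.0
coarse[18:22, 18:22] = 0.0

x = build_mask_prompted_input(image, coarse)
```

Unlike cropping, nothing outside the object is discarded here; the mask only tells the model where to focus.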
Would this align with BiRefNet’s roadmap?
For reference, you can check the related issue: https://github.com/facebookresearch/sam2/issues/327
Hi, @lucasgblu, thanks a lot for the comments and the reply in the refine_foreground issue :)
Yes, box prompting is a very useful strategy for manually selecting the target. The cropping approach is only a temporary way to achieve that.
Actually, I also lean towards using BiRefNet as a general refiner for models like SAM, which offer user-friendly interaction but produce relatively low-quality masks. I've discussed this with many users before, e.g., in another issue. It's 💯 on my to-do list. However, I still lack the time and energy for it... When I have some free time, I'll try to make it happen.
@ZhengPeng7 Thank you for the thoughtful response! 🙌
I’m thrilled to hear that refining SAM-like masks aligns perfectly with your vision for BiRefNet’s future—it’s encouraging to see our ideas converge! Completely understand the constraints on time and resources; open-source maintenance is no small feat.
Whenever this reaches the top of your todo list, I’ll be eagerly awaiting the update. Combining SAM’s flexibility with BiRefNet’s refinement capabilities would create such a powerful segmentation pipeline. For now, I’ll keep enjoying the excellent work you’ve already shipped!
Wishing you smooth progress whenever you dive into this—no rush but much excitement!
Thanks! I'm very glad to hear of your interest and enthusiasm, and to see that my work can be of some help. I'll definitely make it happen once I have the time :)