APro
Question about the annotation-free results on VOC12
Greetings!
Thanks for your great work; both the motivation and the results are impressive. After reading it, I have a question: is the annotation-free segmentation on VOC12 truly done without any annotation? If so, what do you think of CLIP's annotation-free ability, given that it even surpasses some weakly-supervised semantic segmentation methods?
Looking forward to discussing with you!
@Unrealluver Yes, the CLIP-guided semantic segmentation is genuinely free of any annotations, following the setting in MaskCLIP+. Its strong performance can be attributed to the prior knowledge in CLIP, whose exceptional open-set recognition capability sets it apart from weakly-supervised semantic segmentation methods. As for CLIP's annotation-free ability, it can coarsely segment regions within an image, but the results are generally not very precise. Refining the segmentation with accurate details on top of a pretrained CLIP model is a challenging problem, and APro provides one solution.
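To make "coarse segmentation without annotations" concrete, here is a minimal sketch of the general idea: each pixel feature is compared with one text embedding per class name, and the most similar class becomes that pixel's pseudo-label. This is not the APro or MaskCLIP+ code. The function name `coarse_label_map` is hypothetical, and the toy arrays only stand in for real CLIP dense image features and class-prompt embeddings.

```python
import numpy as np

def coarse_label_map(pixel_feats, text_embeds):
    """Assign each pixel the class whose text embedding is most similar.

    pixel_feats: (H, W, D) dense image features (assumed to come from CLIP).
    text_embeds: (C, D) one embedding per class name (assumed to come from CLIP).
    Returns an (H, W) integer label map. No ground-truth masks are used.
    """
    # L2-normalize so the dot product below equals cosine similarity.
    p = pixel_feats / np.linalg.norm(pixel_feats, axis=-1, keepdims=True)
    t = text_embeds / np.linalg.norm(text_embeds, axis=-1, keepdims=True)
    sim = p @ t.T                 # (H, W, C) per-pixel class similarities
    return sim.argmax(axis=-1)    # (H, W) pseudo-labels

# Toy stand-ins (hypothetical data, not real CLIP outputs).
rng = np.random.default_rng(0)
text_embeds = np.eye(3, 8)        # 3 classes, 8-dim orthogonal embeddings
true_labels = np.array([[0, 0, 1, 1],
                        [0, 0, 1, 1],
                        [2, 2, 1, 1],
                        [2, 2, 2, 1]])
# Each pixel feature is its class embedding plus small noise.
pixel_feats = text_embeds[true_labels] + 0.05 * rng.standard_normal((4, 4, 8))

pred = coarse_label_map(pixel_feats, text_embeds)
```

With real CLIP features the resulting map tends to be noisy around object boundaries, which is the imprecision mentioned above and the part a refinement step would need to address.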