Yuxuan Zhang
We freeze SAM during the CLIP experiments.
You are right. There were no negative examples in the training datasets. We are looking for datasets that contain negative examples. So far we have only found G-RefCOCO, which contains a few negative examples, but...
We've used G-RefCOCO to jointly train our model, but such cases still occur. Honestly, it is a problem shared by existing RES models because of limited negative examples...
We haven't encountered a similar problem. Could you provide details of your build environment? You may try: 1. `pip install -U ninja` 2. check if your Python version, CUDA version, PyTorch version, CUDA driver...
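If it helps when reporting the environment, here is a minimal sketch that collects the versions mentioned above in one place. The helper name `collect_env` is just an illustration, not something from this repo; it only uses the standard library plus `torch` if it is installed.

```python
import platform
import shutil


def collect_env():
    """Gather version info that is usually relevant for build problems."""
    info = {
        "python": platform.python_version(),
        # Path to the executables, or None if they are not on PATH.
        "ninja": shutil.which("ninja"),
        "nvcc": shutil.which("nvcc"),
    }
    try:
        import torch

        info["torch"] = torch.__version__
        # CUDA version PyTorch was built against (None for CPU-only builds).
        info["torch_cuda"] = torch.version.cuda
        info["cuda_available"] = torch.cuda.is_available()
    except ImportError:
        info["torch"] = None
    return info


if __name__ == "__main__":
    for key, value in collect_env().items():
        print(f"{key}: {value}")
```

Pasting the printed output into the issue should make a version mismatch easier to spot.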
The main differences between SAM1 and SAM2 are: 1. Pre-processing: SAM2 uses resize(1024), while SAM1 uses resizelongest(1024) + padding. 2. SAM2 uses a hierarchical image encoder and SAM1 uses ViT...
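To make the pre-processing difference concrete, here is a rough sketch that only computes the resulting shapes, with no image library involved. The function names are made up for illustration, and the rounding and the "pad up to a 1024 square" behavior are my assumptions about how resizelongest + padding typically works, not a copy of either codebase.

```python
def sam2_style_size(height, width, target=1024):
    """Direct resize to a square: the aspect ratio is not preserved."""
    return (target, target)


def sam1_style_size(height, width, target=1024):
    """Resize so the longest side equals `target`, then pad to a square.

    Returns the resized (h, w) and the padding (pad_h, pad_w) that is
    needed to reach target x target.
    """
    scale = target / max(height, width)
    new_h = int(height * scale + 0.5)
    new_w = int(width * scale + 0.5)
    return (new_h, new_w), (target - new_h, target - new_w)


if __name__ == "__main__":
    h, w = 480, 640
    print("sam2-style:", sam2_style_size(h, w))
    print("sam1-style:", sam1_style_size(h, w))
```

For a 480x640 image, the first variant stretches the content to 1024x1024, while the second keeps the aspect ratio at 768x1024 and fills the remaining 256 rows with padding, so masks have to be mapped back differently in the two cases.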
The augmentations influence model performance in another way. In referring segmentation tasks, text prompts contain geometric words like "on the left". Once flipping or cropping or some other augmentations...
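A toy illustration of the issue with geometric words, assuming a horizontal flip: the mask moves to the other side, so the prompt is only still correct if "left" and "right" are swapped as well. This is a sketch with hypothetical helper names, not something we use in training, and it is naive (for example, it would also swap "right" when it means "correct").

```python
import re

_SWAP = {"left": "right", "right": "left"}


def hflip_mask(mask):
    """Horizontally flip a mask given as a list of rows."""
    return [row[::-1] for row in mask]


def hflip_prompt(prompt):
    """Swap the words 'left' and 'right' so the text matches a flipped image."""
    return re.sub(
        r"\b(left|right)\b",
        lambda m: _SWAP[m.group(1).lower()],
        prompt,
        flags=re.IGNORECASE,
    )


if __name__ == "__main__":
    mask = [[1, 0, 0],
            [1, 0, 0]]  # object on the left
    print(hflip_mask(mask))
    print(hflip_prompt("the cup on the left"))
```

Without the text swap, the flipped sample pairs "on the left" with an object that now sits on the right, which gives the model a contradictory supervision signal.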
That's amazing. What augmentation did you use? Could the reason be that the augmentation wasn't applied to the source fed to `multi_model_extractor`? I'm curious about this bug, honestly.
In fact, we use no augmentation when training our model. It is strange that scale jittering affects the performance of SAM2. Let me know if you find any other reasons,...