Replacing SAM with SAM2.1 leads to bad results

Open qiuqingheng opened this issue 8 months ago • 4 comments

In my earlier tests, segmenting the whole image with sam2.1 worked better than with sam, so I changed only the model code in extract_segment_everything_masks.py:

```python
import os

import cv2
import torch
from tqdm import tqdm
from sam2.build_sam import build_sam2
from sam2.automatic_mask_generator import SAM2AutomaticMaskGenerator

# `args` comes from the script's existing argparse setup (not shown here).
print("Initializing SAM...")
sam2 = build_sam2(config_file=args.model_cfg, ckpt_path=args.sam_checkpoint_path, device='cuda', apply_postprocessing=False)

# custom
mask_generator = SAM2AutomaticMaskGenerator(
    model=sam2,
    points_per_side=32,
    pred_iou_thresh=0.88,
    box_nms_thresh=0.7,
    stability_score_thresh=0.95,
    crop_n_layers=0,
    crop_n_points_downscale_factor=1,
    min_mask_region_area=100,
)

# Normalise once so the comparison and the integer divisions below agree,
# whether argparse delivers the factor as a string or as an int.
downsample = int(args.downsample)

downsample_manually = False
if downsample == 1 or args.downsample_type == 'mask':
    IMAGE_DIR = os.path.join(args.image_root, 'images')
else:
    IMAGE_DIR = os.path.join(args.image_root, 'images_' + str(downsample))
    if not os.path.exists(IMAGE_DIR):
        IMAGE_DIR = os.path.join(args.image_root, 'images')
        downsample_manually = True
        print("No downsampled images, do it manually.")

# `assert cond and "msg"` never shows the message; use the two-argument form.
assert os.path.exists(IMAGE_DIR), "Please specify a valid image root"
OUTPUT_DIR = os.path.join(args.image_root, 'sam_masks')
os.makedirs(OUTPUT_DIR, exist_ok=True)

print("Extracting SAM segment everything masks...")

for path in tqdm(sorted(os.listdir(IMAGE_DIR))):
    name = path.split('.')[0]
    img = cv2.imread(os.path.join(IMAGE_DIR, path))
    if downsample_manually:
        img = cv2.resize(img, dsize=(img.shape[1] // downsample, img.shape[0] // downsample),
                         interpolation=cv2.INTER_LINEAR)
    masks = mask_generator.generate(img)

    mask_list = []
    for m in masks:
        m_score = torch.from_numpy(m['segmentation']).float().to('cuda')

        if args.downsample_type == 'mask':
            # Downsample the mask itself, then re-binarise at 0.5.
            m_score = torch.nn.functional.interpolate(
                m_score.unsqueeze(0).unsqueeze(0),
                size=(img.shape[0] // downsample, img.shape[1] // downsample),
                mode='bilinear', align_corners=False).squeeze()
            m_score = m_score >= 0.5

        # Skip masks that are entirely foreground or entirely background.
        if len(m_score.unique()) < 2:
            continue
        mask_list.append(m_score.bool())
    masks = torch.stack(mask_list, dim=0)
    torch.save(masks, os.path.join(OUTPUT_DIR, name + '.pt'))
```

However, in the "Cluster in 3D" section of prompt_segmenting.ipynb, the sam2.1 masks gave rather poor results:

Image

With the original sam, the results were good:

Image

In my opinion, the quality of the final segmentation depends on the masks the model produces, so sam2.1 should give better results. Is there anything in the other modules that could be affecting the result?

qiuqingheng (Apr 13 '25 14:04)

Hi, thanks for reporting this. We have not tested SAM2 yet, but we think it should work with SAGA. You may want to check the resulting 2D masks and make sure they are of high quality.

By the way, if you use SAM 1 for preprocessing, can you get good results in the same scene?
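One way to check the saved 2D masks is sketched below. It assumes each file under `sam_masks/` holds a boolean tensor of shape (N, H, W), as written by the extraction script above; the helper names `load_masks` and `summarize_masks` are hypothetical and not part of SAGA. The summary reports how many masks there are, how large they are relative to the image, and how much of the image is uncovered or covered by several masks at once.

```python
import numpy as np


def load_masks(pt_path):
    """Load a (N, H, W) boolean mask stack saved with torch.save."""
    import torch  # imported lazily so the statistics below only need NumPy
    return torch.load(pt_path, map_location="cpu").numpy().astype(bool)


def summarize_masks(masks):
    """Return simple quality indicators for a (N, H, W) boolean mask stack."""
    masks = np.asarray(masks, dtype=bool)
    n, h, w = masks.shape
    # Fraction of the image each mask occupies.
    areas = masks.reshape(n, -1).sum(axis=1) / float(h * w)
    # Number of masks covering each pixel.
    hits = masks.sum(axis=0)
    return {
        "num_masks": n,
        "resolution": (h, w),
        "min_area": float(areas.min()),
        "max_area": float(areas.max()),
        "uncovered": float((hits == 0).mean()),   # pixels in no mask
        "overlapped": float((hits > 1).mean()),   # pixels in more than one mask
    }
```

Running `summarize_masks(load_masks(path))` on the same frame for the SAM and SAM2.1 outputs would show whether the two mask sets differ in resolution, count, or coverage before they reach the later stages.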

Jumpat (Apr 22 '25 02:04)

@Jumpat Thank you very much for your reply. I originally wanted to use sam2.1 because it can be trained. Its segmentation of the same picture is better than sam's, and its ability on specific objects can be improved through training. The following is the sam2.1 segmentation of my blueberries.

Image

The following is the segmentation using sam from your project.

Image

In my opinion, the sam2.1 segmentation is more complete. For this image, sam2.1 generated 42 masks, which I saved as a .pt file following your code. Unfortunately, the subsequent results were not good. I am still running experiments to find a solution. I feel a little embarrassed to ask, but if you have time, I can send you more information by email. Here are my email and WeChat: [email protected], QiuQingHeng4869. Thank you very much!
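To make the comparison between the two mask sets concrete, the sketch below computes, for every mask in one set, its best IoU against the other set. It assumes both stacks are boolean arrays of shape (N, H, W) at the same resolution; `best_match_iou` is a hypothetical helper, not a SAGA or SAM function. Low scores point to regions that one model segments and the other does not.

```python
import numpy as np


def best_match_iou(masks_a, masks_b):
    """For each mask in masks_a, return its highest IoU with any mask in masks_b.

    Both inputs are (N, H, W) boolean stacks with identical H and W.
    """
    a = np.asarray(masks_a, dtype=bool)
    b = np.asarray(masks_b, dtype=bool)
    a = a.reshape(a.shape[0], -1).astype(np.float32)
    b = b.reshape(b.shape[0], -1).astype(np.float32)
    inter = a @ b.T                                       # pairwise intersections
    union = a.sum(axis=1)[:, None] + b.sum(axis=1)[None, :] - inter
    iou = inter / np.maximum(union, 1.0)                  # guard against empty masks
    return iou.max(axis=1)
```

For example, passing the SAM stack as `masks_a` and the SAM2.1 stack as `masks_b` lists, per SAM mask, how well SAM2.1 reproduces it.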

qiuqingheng (Apr 22 '25 06:04)

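Hello. The image you provided does show a correct segmentation, but is it the SAM result or SAGA's prediction? It is possible that SAGA has difficulty with this scene regardless of which segmentation model (SAM or SAM2) is used.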

Jumpat (Apr 27 '25 08:04)

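@Jumpat My reply above shows sam's 2D image segmentation results. When I ran the experiment with the original project's sam, the final GS segmentation result looked like this: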

Image

But after I switched to sam2.1, whose 2D image segmentation is shown in my previous reply, the final SAGA segmentation result looked like this:

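Image

To be honest, I also tried switching to the sam_hq model, which did not even require any code changes, but the final result was still worse than with the original sam.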

qiuqingheng (Apr 28 '25 10:04)