SegAnyGAussians

Setting scale when evaluating the segmentation performance

Open MinjiK11 opened this issue 7 months ago • 3 comments

Hi, thank you for your interesting work! I have a question about setting the scale when evaluating 3D segmentation.

In the evaluation code, it seems the segmented object is searched for only at a single given scale (the upper bound of the mask scales).

(in provided eval_3dovs.ipynb notebook)

scale = upper_bound_scale                  # single query scale: upper bound of the mask scales
scale = torch.full((1,), scale).cuda()
scale = q_trans(scale)
gates = scale_gate(scale)                  # scale-conditioned feature gates

Since it searches at a single scale, it fails to segment small objects like the egg in the bowl in the ramen scene of the LeRF-OVS dataset. Should I modify the code so that it searches through multiple scales and selects the one with the best segmentation result?
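For concreteness, the modification I have in mind looks roughly like the sketch below. `segment_at_scale` and `score_mask` are hypothetical stand-ins for the notebook's scale-gated segmentation and some mask-quality metric; they are not functions from the SAGA codebase.

```python
# Sketch of a multi-scale search: sweep candidate scales, score the
# segmentation produced at each scale, and keep the best-scoring one.

def segment_at_scale(scale):
    # Hypothetical stand-in: pretend the small object only separates
    # out cleanly near scale 0.2.
    return {"scale": scale, "error": abs(scale - 0.2)}

def score_mask(mask):
    # Hypothetical quality metric (higher is better). In practice this
    # must not use ground-truth annotations, or it leaks test data.
    return -mask["error"]

def best_scale_segmentation(candidate_scales):
    scored = [(score_mask(m), s, m)
              for s in candidate_scales
              for m in [segment_at_scale(s)]]
    _, best_scale, best_mask = max(scored, key=lambda r: r[0])
    return best_scale, best_mask

candidate_scales = [0.1 * i for i in range(1, 11)]
best_scale, best_mask = best_scale_segmentation(candidate_scales)
```

The open question, of course, is what `score_mask` should be when no ground truth is available.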

MinjiK11 avatar May 21 '25 13:05 MinjiK11

This is a challenging issue. As discussed in the appendix of our paper, SAGA does encounter multi-scale ambiguity when applied to open-vocabulary segmentation in complex scenes such as Lerf-OVS ramen. It’s unclear whether searching across multiple scales would effectively address this problem. Even worse, we currently lack a reliable metric to determine the optimal scale for a given text prompt. Using ground truth annotations to guide this selection would introduce data leakage.
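To make the difficulty concrete, one could imagine a ground-truth-free heuristic such as picking the scale whose mask maximizes the margin between the mean query similarity inside and outside the mask. The sketch below is purely illustrative (none of these names come from SAGA), and it is exactly the kind of proxy metric whose reliability is unclear:

```python
import numpy as np

def confidence_margin(similarity, mask):
    """Mean query similarity inside the mask minus outside it.

    similarity: per-Gaussian similarity to the text prompt, shape (N,)
    mask: boolean segmentation at one scale, shape (N,)
    """
    inside = similarity[mask].mean() if mask.any() else -np.inf
    outside = similarity[~mask].mean() if (~mask).any() else 0.0
    return inside - outside

def pick_scale(similarity, masks_by_scale):
    # masks_by_scale: {scale: boolean mask at that scale}
    return max(masks_by_scale,
               key=lambda s: confidence_margin(similarity, masks_by_scale[s]))

# Toy example: at scale 0.5 the mask isolates the high-similarity
# Gaussians; at scale 1.0 everything is merged into one mask.
sim = np.array([0.9, 0.8, 0.1, 0.2])
masks = {0.5: np.array([True, True, False, False]),
         1.0: np.array([True, True, True, True])}
chosen = pick_scale(sim, masks)
```

Such a margin avoids data leakage, but whether it actually tracks segmentation quality across scales is an empirical question.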

Jumpat avatar May 29 '25 06:05 Jumpat

@Jumpat I also found that, for the 360_v2 garden scene, not all images get their mask scales extracted into .pt files in the mask_scale folder. The failures are dropped silently at run time, leaving only a few .pt files, so I cannot proceed to the next step.

Please update the code to fix this issue.

Also, supporting torch versions > 2.3.1 would make the environment much easier to build. The currently pinned version is quite old, and there are issues with torch.eig() (replaced by torch.linalg.eig in newer releases) in your PCA function.
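Since the covariance matrix in a PCA is symmetric, the natural migration target is torch.linalg.eigh rather than the general torch.linalg.eig. A NumPy sketch of what the migrated PCA step could look like (the function name and shapes are illustrative, not taken from the repository; np.linalg.eigh, like torch.linalg.eigh, returns eigenvalues in ascending order):

```python
import numpy as np

def pca_project(features, k):
    """Project (N, D) feature vectors onto their top-k principal axes.

    In modern PyTorch the removed torch.eig call would become
    torch.linalg.eigh(cov), since the covariance is symmetric.
    """
    centered = features - features.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / (len(features) - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)   # ascending eigenvalues
    top = eigvecs[:, ::-1][:, :k]            # columns of the k largest
    return centered @ top

X = np.random.default_rng(0).normal(size=(100, 8))
Y = pca_project(X, 3)
```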

jeffhwang02 avatar Jun 08 '25 00:06 jeffhwang02

@Jumpat Thank you for your kind reply! I would like to ask another question about the performance of SAGA on different datasets. Do you think SAGA can work well on large-scale scene datasets like ScanNet (a dataset of indoor scenes)? Personally, it seems hard for SAGA to learn the affinity feature of each Gaussian consistently in large scenes.

MinjiK11 avatar Jun 12 '25 03:06 MinjiK11