Open3DIS Reproduce ScanNet200 Results

Hi @PhucNDA,

I am trying to reproduce the ScanNet200 results. I prepared the data and followed your instructions for running the code. I got the results below:

ScanNet200 Evaluation
################################################
what           :      AP  AP_50%  AP_25%
################################################
Head AP        :   0.254   0.314   0.342
Common AP      :   0.209   0.259   0.282
Tail AP        :   0.212   0.260   0.295
Base AP        :   0.246   0.308   0.342
Novel AP       :   0.218   0.267   0.294
------------------------------------------------
AP             :   0.226   0.279   0.307
################################################

My results (0.226, 0.279, 0.307) are somewhat lower than the paper's (0.237, 0.294, 0.328) for AP, AP_50%, and AP_25% respectively. Is this gap within an acceptable range, or am I missing a step in the reproduction?
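To put a number on the gap, here is a minimal sketch that computes the absolute and relative differences from the values quoted above. The function name `gaps` is only illustrative and is not part of the Open3DIS code.

```python
# AP values quoted in this post (reproduced run vs. paper).
reproduced = {"AP": 0.226, "AP_50%": 0.279, "AP_25%": 0.307}
paper = {"AP": 0.237, "AP_50%": 0.294, "AP_25%": 0.328}


def gaps(reproduced, paper):
    """Return {metric: (absolute_gap, relative_gap_in_percent)}."""
    result = {}
    for metric, paper_value in paper.items():
        diff = paper_value - reproduced[metric]
        result[metric] = (round(diff, 3), round(100 * diff / paper_value, 1))
    return result


for metric, (abs_gap, rel_gap) in gaps(reproduced, paper).items():
    print(f"{metric:7s}: -{abs_gap:.3f} absolute, -{rel_gap:.1f}% relative")
```

By this calculation the shortfall is roughly 1 to 2 AP points, or about 5 to 6% relative to the paper.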

Here are the models I use:

Grounded-SAM (groundingdino_swint_ogc, sam_vit_h_4b8939), CLIP (ViT-L/14@336px)

I only changed the file paths in the config and set the agnostic flag to False. Here is my config:

proposals:
  p2d: True        # 2D branch
  p3d: True        # 3D branch
  agnostic: False
  refined: True

Here is my understanding of running the code:

1. grounding_2d.sh: Generate the 2D masks (maskGdino) and the first-stage features (grounded_feat). This step takes many hours to run.
2. generate_3d_inst.sh: Generate 3D instances (hier_agglo) from the 2D masks using hierarchical agglomerative clustering.
3. refine_grounding_feat: Compute the second-stage features from the 3D instances (hier_agglo) and output the refined features (refined_grounded_feat).
4. generate_3d_inst.sh: Finalize the 3D output masks from the refined features (refined_grounded_feat). In this step, I changed the bool at https://github.com/VinAIResearch/Open3DIS/blob/4b05043095aff1dcbc9882799d25e0fb6f4c86a9/tools/generate_3d_inst.py#L278 to False and the bool at https://github.com/VinAIResearch/Open3DIS/blob/4b05043095aff1dcbc9882799d25e0fb6f4c86a9/tools/generate_3d_inst.py#L296 to True, to get the final output (final_result_hier_agglo) instead of the 3D instances (hier_agglo).
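One thing that could be checked is whether every stage produced output for every scene, since a partially finished stage might lower the final AP without raising an error. The sketch below counts the files in each stage's output folder. The folder names come from the steps above, but the layout (one folder per stage directly under a single experiment root, one file per scene) is an assumption, and `count_outputs` and `incomplete_folders` are hypothetical helpers, not part of Open3DIS.

```python
from pathlib import Path

# Stage -> output folder names as described in this post.
# ASSUMPTION: each folder sits directly under one experiment root
# and holds one file per scene.
STAGES = [
    ("grounding_2d.sh", ["maskGdino", "grounded_feat"]),
    ("generate_3d_inst.sh (first pass)", ["hier_agglo"]),
    ("refine_grounding_feat", ["refined_grounded_feat"]),
    ("generate_3d_inst.sh (second pass)", ["final_result_hier_agglo"]),
]


def count_outputs(exp_root):
    """Count files in each stage's output folder (0 if the folder is missing)."""
    root = Path(exp_root)
    report = {}
    for _stage, folders in STAGES:
        for folder in folders:
            directory = root / folder
            if directory.is_dir():
                report[folder] = sum(1 for p in directory.iterdir() if p.is_file())
            else:
                report[folder] = 0
    return report


def incomplete_folders(exp_root, expected_scenes):
    """Return the folders holding fewer files than the expected scene count."""
    counts = count_outputs(exp_root)
    return [name for name, n in counts.items() if n < expected_scenes]
```

Calling `incomplete_folders(exp_root, n)` with the experiment root and the number of validation scenes would list any stage whose output looks incomplete under these assumptions.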

I understand that it is hard to pinpoint the problem from a description alone. Your work is very cool, and I would really appreciate any insight you could give me.

Thanks.

Oct 27 '24 12:10 Louis708

Hi @Louis708,

How did you generate the 3D features from ISBNet?

Oct 27 '24 12:10 PhucNDA

Hi @Louis708,

You may want to independently verify the results for the 3D backbone-only and 2D-only cases to help identify the bug. These specific results are available on our webpage. If you encounter any issues with the source code, please don’t hesitate to reach out to me.
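As a rough sketch of how the two ablations might be set up, the snippet below derives 3D-only and 2D-only variants from the `proposals` config quoted earlier in this thread by toggling `p2d` and `p3d`. This is an assumption about how the flags are used, not a confirmed procedure; in particular, whether `refined` should also change for either variant is not specified here, so it is left untouched. `ablation_configs` is a hypothetical helper.

```python
import copy

# The `proposals` block quoted earlier in this thread.
base_config = {
    "proposals": {"p2d": True, "p3d": True, "agnostic": False, "refined": True}
}


def ablation_configs(base):
    """Return full, 3D-only, and 2D-only variants of the config.

    ASSUMPTION: disabling a branch only requires flipping its flag.
    """
    variants = {}
    for name, (use_2d, use_3d) in {
        "2d_and_3d": (True, True),
        "3d_only": (False, True),
        "2d_only": (True, False),
    }.items():
        cfg = copy.deepcopy(base)  # never mutate the original config
        cfg["proposals"]["p2d"] = use_2d
        cfg["proposals"]["p3d"] = use_3d
        variants[name] = cfg
    return variants


for name, cfg in ablation_configs(base_config).items():
    print(name, cfg["proposals"])
```

Evaluating each variant separately and comparing against the corresponding published numbers would show which branch accounts for the gap.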

Oct 27 '24 12:10 PhucNDA

Hi @PhucNDA,

Thanks for the very quick reply. I generated the 3D features from ISBNet using:

cd segmenter3d/ISBNet/
python3 tools/test.py configs/scannet200/isbnet_scannet200.yaml pretrains/scannet200/head_scannetv2_200_val.pth

as described in https://github.com/VinAIResearch/Open3DIS/blob/main/docs/DATA.md#3d-backbone

I will check the 3D backbone-only and 2D-only results on ScanNet200 later.

Thanks

Oct 27 '24 13:10 Louis708

Would you mind sharing how you modified eval.py, and your GT file, please?

Nov 12 '24 07:11 sgmzhou4