EmbodiedScan

[Docs] Annotations for Monocular 3D Perception

Open chanhee-luke opened this issue 1 year ago • 5 comments

Branch

main branch https://mmdetection3d.readthedocs.io/en/latest/

📚 The doc issue

Hi, are the annotations for Monocular 3D Perception referred to in the paper available to the public? I don't see them in the data currently provided. Thanks!

-Luke

Suggest a potential alternative/fix

No response

chanhee-luke avatar Aug 06 '24 18:08 chanhee-luke

We provide visible_instance_ids and visible_occupancy_masks for each image. It's easy to construct the monocular setting using these.

mxh1999 avatar Aug 07 '24 04:08 mxh1999
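Editor's sketch (not from the maintainers): one way to read "construct the monocular setting" is to keep, for each image, only the instances listed in its visible_instance_ids. The layout below is an assumption: a scene dict with an "images" list and an "instances" list, where the ids index into "instances". The field names other than visible_instance_ids and img_path are hypothetical, so check them against the actual annotation files.

```python
def visible_instances(scene, image_index):
    """Return the instance records visible in one image.

    Assumption (not confirmed in this thread): each entry of
    scene["images"] has a "visible_instance_ids" list whose values
    index into scene["instances"].
    """
    image = scene["images"][image_index]
    instances = scene["instances"]
    return [instances[i] for i in image["visible_instance_ids"]]


# Tiny made-up scene, only to show the assumed shape of the data.
scene = {
    "instances": [
        {"bbox_id": 0, "bbox_label_3d": 3},
        {"bbox_id": 1, "bbox_label_3d": 7},
        {"bbox_id": 2, "bbox_label_3d": 7},
    ],
    "images": [
        {"img_path": "a.jpg", "visible_instance_ids": [0, 2]},
        {"img_path": "b.jpg", "visible_instance_ids": [1]},
    ],
}

# Per-image ("monocular") annotation for the first image.
mono = visible_instances(scene, 0)
print([inst["bbox_id"] for inst in mono])
```

If the ids turn out to be instance identifiers and not list positions, the lookup would need a dict keyed by that identifier.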

Thanks! How do I get the visible_occupancy_masks for each image? Can you guide me on how to extract each one from occupancy annotations? I tried looking at the annotations and it was a little confusing.

chanhee-luke avatar Aug 07 '24 23:08 chanhee-luke

@chanhee-luke Following the guidance, you can find visible_occupancy.pkl for each scene. It is a list of visible_occupancy_annotation entries, each of which contains the img_path and the corresponding visible_occupancy.

mxh1999 avatar Aug 08 '24 03:08 mxh1999
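Editor's sketch (not from the maintainers): reading such a file with the standard pickle module could look like the following. It assumes each list entry is a dict with the keys "img_path" and "visible_occupancy". The thread names those two fields but does not confirm the exact keys or value types, so treat them as hypothetical. The demo writes a small synthetic file so that the snippet runs without the dataset.

```python
import os
import pickle
import tempfile

import numpy as np


def load_visible_occupancy(pkl_path):
    """Map each image path to its visible-occupancy array.

    Assumption: the pickle holds a list of dicts with the keys
    "img_path" and "visible_occupancy" (hypothetical key names).
    """
    with open(pkl_path, "rb") as f:
        annotations = pickle.load(f)
    return {a["img_path"]: np.asarray(a["visible_occupancy"]) for a in annotations}


# Synthetic stand-in for a scene's visible_occupancy.pkl.
fake = [
    {"img_path": "scene/a.jpg", "visible_occupancy": np.zeros((40, 40, 16), dtype=bool)},
    {"img_path": "scene/b.jpg", "visible_occupancy": np.ones((40, 40, 16), dtype=bool)},
]
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "visible_occupancy.pkl")
    with open(path, "wb") as f:
        pickle.dump(fake, f)
    per_image = load_visible_occupancy(path)

for img_path, occ in per_image.items():
    print(img_path, occ.shape, occ.dtype)
```

Only unpickle files from a source you trust, since pickle can execute arbitrary code on load.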

Hi, the .pkl file seems to contain an array of shape (40, 40, 16) (for matterport3d) for each image. How should I match each image pixel's occupancy with the array?

chanhee-luke avatar Aug 09 '24 00:08 chanhee-luke

It seems that there's a misunderstanding about the definition of occupancy. Following TPVFormer, our occupancy is the semantic labels of dense voxels in 3D space.

mxh1999 avatar Aug 09 '24 04:08 mxh1999
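Editor's sketch (not from the maintainers): since the occupancy is defined over voxels in 3D space, there is no per-pixel array to look up. One way to relate the two is to compute voxel centers and project them into the image with a pinhole camera model. Everything numeric below is hypothetical: the grid bounds, the coordinate frame of the grid, the world-to-camera matrix, and the intrinsics must all come from the dataset's own metadata.

```python
import numpy as np


def voxel_centers(grid_shape, range_min, range_max):
    """3D center of every voxel in an axis-aligned grid, shape (X, Y, Z, 3)."""
    range_min = np.asarray(range_min, dtype=float)
    range_max = np.asarray(range_max, dtype=float)
    voxel_size = (range_max - range_min) / np.asarray(grid_shape)
    axes = [np.arange(n) for n in grid_shape]
    idx = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1)
    return range_min + (idx + 0.5) * voxel_size


def project_to_pixels(points, world2cam, intrinsic):
    """Project 3D points to pixel coordinates with a pinhole model.

    Returns (uv, in_front): uv is (N, 2), NaN for points that are
    not in front of the camera; in_front is the matching boolean mask.
    """
    pts = np.asarray(points, dtype=float).reshape(-1, 3)
    homo = np.concatenate([pts, np.ones((len(pts), 1))], axis=1)
    cam = (world2cam @ homo.T).T[:, :3]
    in_front = cam[:, 2] > 1e-6
    uv = np.full((len(pts), 2), np.nan)
    proj = (intrinsic @ cam[in_front].T).T
    uv[in_front] = proj[:, :2] / proj[:, 2:3]
    return uv, in_front


# Hypothetical grid bounds and camera, for illustration only.
centers = voxel_centers((40, 40, 16), (0.0, 0.0, 0.0), (4.0, 4.0, 1.6))
K = np.array([[100.0, 0.0, 50.0], [0.0, 100.0, 50.0], [0.0, 0.0, 1.0]])
uv, in_front = project_to_pixels(centers, np.eye(4), K)
print(centers.shape, uv.shape, int(in_front.sum()))
```

A voxel whose projected center falls inside the image bounds can then be paired with that pixel. This ignores occlusion, which is presumably what the provided visible-occupancy data accounts for.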