Questions about speed?
Thanks for your interesting work! However, I found that running 3d_prompt_proposal.py and main.py is very slow, even for a single scene on ScanNet (with ViT-Base, > 1 h on one V100). Is this normal?
Thanks for your interest in our work.
Generally, running 3d_prompt_proposal.py on a single ScanNet scene takes 10~15 minutes.
So the major issue is probably your hardware: not just the graphics card, but mainly the CPU or, especially, the disk.
Since our code keeps reading many small files (the RGB frames) from disk, I recommend storing the RGB frames and other data on an SSD, and making sure your CPU is not the bottleneck.
You can also use fewer frames by increasing the frame gap, though this may degrade the segmentation performance.
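For reference, frame subsampling with a fixed gap could look roughly like the sketch below; the directory layout and names are only illustrative, not the exact loading code in 3d_prompt_proposal.py.

```python
import os

# Illustrative sketch only: the directory layout and file naming below are
# assumptions, not the exact loading code in 3d_prompt_proposal.py.
def select_frames(color_dir, frame_gap=5):
    """Return every `frame_gap`-th RGB frame path, sorted by frame index."""
    frames = sorted(
        (f for f in os.listdir(color_dir) if f.endswith(".jpg")),
        key=lambda f: int(os.path.splitext(f)[0]),
    )
    return [os.path.join(color_dir, f) for f in frames[::frame_gap]]

# e.g. selected = select_frames("scene0000_00/color", frame_gap=5)
```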
Hope this is helpful!
Thanks!
I’ve stored the color and depth files as your code specifies.
The key factor may be the number of frames in the ScanNet dataset, which ranges from over 1,000 to more than 5,000 frames. I can process around 1,000 frames in approximately 15 minutes, but it takes about an hour to process 5,000 frames.
Considering the significant overlap between image frames, adjusting the frame gap based on the total number of frames might help. Unfortunately, there is no evaluation code for computing mIoU, so the impact of skipping frames cannot be measured directly. Do you have any insights on how to select the optimal number of frames?
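To make this concrete, here is a rough sketch of the heuristic I have in mind (the function and parameters are just my own illustration, not part of your code):

```python
# Rough heuristic, just my own illustration (not part of the released code):
# scale the frame gap with the scene length so that every scene processes a
# roughly constant number of frames.
def adaptive_frame_gap(num_frames, target_frames=200, min_gap=1, max_gap=20):
    gap = max(min_gap, round(num_frames / target_frames))
    return min(gap, max_gap)

# e.g. ~1,000 frames -> gap 5; ~5,000 frames -> gap 20 (capped at max_gap)
```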
Hi,
Skipping 5 frames should still produce a fair result; a gap of 10 or 20 will somewhat degrade the performance.
As for how to evaluate the result, for now I can only suggest checking the result visualizations yourself. In my experience, skipping 5 frames also produces a visually good result.
Thanks for your reply!
As for visualization, I can't render and review every scene in detail, which makes it hard to identify the key factors.
Additionally, I am working on leveraging CLIP and SAM to obtain standard semantic segmentation results. Your aggregation method has truly been an inspiration. Thanks again!
Great work, I found it really interesting!
Another comment: I am running this on my own data. Maybe I should have downsampled the point cloud a bit. Regardless, at the end, when running main.py, I have a pretty big point cloud plus lots of init_prompts even after prompt_filtering. I noticed a lot of memory-intensive code in perform_3dsegmentation. I would suggest that future users optimize that code if they are running on large amounts of data: tensors are put on CUDA even when they don't need to be, and multiple copies of large tensors are created on the CUDA device. I can propose a PR in the future, but just an FYI for current users. Thanks!