Runtime Error on MipNeRF-360 dataset

Open 2085924055 opened this issue 1 year ago • 23 comments

When I run the following command:

Python run.py --config=configs/llff/kitchen.py --stop_at=20000 --render_video --i_weights=10000

I get this error:

File "G:\SegmentAnythingin3D-master\lib\grid.py", line 171, in __init__
    self.xy_plane = nn.Parameter(torch.randn([1, Rxy, X, Y]) * 0.1)
RuntimeError: CUDA out of memory. Tried to allocate 2356.25 GiB (GPU 0; 23.99 GiB total capacity; 25.00 KiB already allocated; 22.04 GiB free; 2.00 MiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

The error occurs on line 171 of lib\grid.py, where memory is allocated for self.xy_plane. The code randomly initializes a tensor of shape [1, Rxy, X, Y], but the requested allocation is unreasonably large.
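As a rough sanity check, the memory needed by a dense tensor of shape [1, Rxy, X, Y] can be estimated before it is allocated. The sketch below is illustrative only: `estimate_gib` and the example shapes are hypothetical and are not part of the SA3D code, and 4 bytes per element assumes float32 (the default dtype of torch.randn).

```python
from math import prod


def estimate_gib(shape, bytes_per_element=4):
    """Return the size in GiB of a dense tensor with the given shape.

    bytes_per_element=4 assumes float32 values.
    """
    return prod(shape) * bytes_per_element / 2**30


# Hypothetical plane shapes, not taken from the SA3D configs.
for shape in ([1, 48, 512, 512], [1, 48, 100000, 100000]):
    print(f"{shape}: {estimate_gib(shape):.3f} GiB")
```

When the estimate exceeds the GPU's capacity by orders of magnitude, as 2356.25 GiB does against 23.99 GiB, the grid dimensions themselves are the more likely culprit than memory fragmentation.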

Has anyone else faced this issue?

2085924055 avatar Feb 26 '24 08:02 2085924055

Hi! Did you change any code? The '2356.25 GiB' to be allocated seems a little weird. This seems to be caused by an unexpected broadcast operation.

Jumpat avatar Feb 26 '24 09:02 Jumpat

Thank you for your reply. I did not change the code. I found that the problem was the command I used; I should have used the following command instead:

Python run.py --config=configs/nerf_unbounded/kitchen.py --stop_at=20000 --render_video --i_weights=10000

I don't know why the original command fails, but I think the command above gives the result I want.

2085924055 avatar Mar 02 '24 09:03 2085924055

Which config should be used when running on the MipNeRF-360 dataset: llff/kitchen.py or nerf_unbounded/kitchen.py? I don't really understand.

2085924055 avatar Mar 02 '24 09:03 2085924055

When I run the following command:

Python run.py --config=configs/nerf_unbounded/kitchen.py --stop_at=20000 --render_video --i_weights=10000

I get this error:

File "G:\SegmentAnythingin3D-master\run.py", line 690, in <module>
    train(args, cfg, data_dict)
File "G:\SegmentAnythingin3D-master\run.py", line 621, in train
    scene_rep_reconstruction(
File "G:\SegmentAnythingin3D-master\run.py", line 520, in scene_rep_reconstruction
    loss_distortion = flatten_eff_distloss(w, s, 1/n_max, ray_id)
File "D:\Anaconda3\envs\SA3D\lib\site-packages\torch_efficient_distloss\eff_distloss.py", line 93, in forward
    segment_cumsum_cuda = load(
File "D:\Anaconda3\envs\SA3D\lib\site-packages\torch\utils\cpp_extension.py", line 1202, in load
    return _jit_compile(
File "D:\Anaconda3\envs\SA3D\lib\site-packages\torch\utils\cpp_extension.py", line 1450, in _jit_compile
    return _import_module_from_library(name, build_directory, is_python_module)
File "D:\Anaconda3\envs\SA3D\lib\site-packages\torch\utils\cpp_extension.py", line 1844, in _import_module_from_library
    module = importlib.util.module_from_spec(spec)
File "", line 571, in module_from_spec
File "", line 1176, in create_module
File "", line 241, in _call_with_frames_removed
ImportError: DLL load failed while importing segment_cumsum_cuda: The specified module could not be found.

This error occurs when the command above is run on the 360_v2 dataset. On the nerf_llff_data dataset the code runs properly and generates fly-through videos. So I don't understand why this is happening: doesn't the fact that it works on nerf_llff_data mean that the environment is set up correctly? Why can't the CUDA extension module be found?
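The traceback shows segment_cumsum_cuda being compiled and loaded at run time by torch.utils.cpp_extension.load, on the first call to flatten_eff_distloss. A run that never reaches that call would not exercise the build toolchain, which is one possible explanation (not confirmed in this thread) for why another config works in the same environment. The sketch below uses only the standard library to report whether tools that a JIT CUDA build typically relies on are visible. The function name `build_tool_report` is hypothetical, and which of these tools is actually missing here is an assumption to be verified.

```python
import os
import shutil
import sys


def build_tool_report():
    """Collect the locations of tools a JIT CUDA build typically relies on."""
    return {
        "platform": sys.platform,
        "nvcc": shutil.which("nvcc"),    # CUDA compiler
        "cl": shutil.which("cl"),        # MSVC compiler (relevant on Windows)
        "ninja": shutil.which("ninja"),  # build tool used by cpp_extension.load
        "CUDA_HOME": os.environ.get("CUDA_HOME"),
        "CUDA_PATH": os.environ.get("CUDA_PATH"),
    }


for name, value in build_tool_report().items():
    print(f"{name}: {value if value else 'NOT FOUND'}")
```

An entry reported as NOT FOUND is only a hint about where to look next, not proof of the cause of the DLL load failure.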

2085924055 avatar Mar 02 '24 10:03 2085924055

You may need to use the nerf_unbounded config as the MIP360 dataset involves several unbounded in-the-wild scenes.

I found some similar issues about your missing-module problem. You can check whether they help. We have never encountered this problem ourselves.

Jumpat avatar Mar 04 '24 02:03 Jumpat

Thank you very much for your reply.

2085924055 avatar Mar 04 '24 06:03 2085924055

When I run the following command:

Python run_seg_gui.py --config=configs/nerf_unbounded/seg_kitchen.py --segment --sp_name=_gui --num_prompts=20 --render_opt=train --save_ckpt

I get this error:

Traceback (most recent call last):
File "E:\pycode\SegmentAnythingin3D\run_seg_gui.py", line 106, in <module>
    train_seg(args, cfg, data_dict)
File "E:\pycode\SegmentAnythingin3D\run_seg_gui.py", line 55, in train_seg
    gui.run()
File "E:\pycode\SegmentAnythingin3D\lib\gui.py", line 58, in run
    init_rgb = self.Seg3d.init_model()
File "E:\pycode\SegmentAnythingin3D\lib\sam3d.py", line 89, in init_model
    assert reload_ckpt_path is not None and 'segmentation must based on a pretrained NeRF'
AssertionError

The error is an AssertionError, which means that an assertion in the code failed. Here it occurred in the init_model function at line 89 of sam3d.py. The asserted expression, reload_ckpt_path is not None and 'segmentation must based on a pretrained NeRF', has two parts: (1) reload_ckpt_path is not None requires that the reload_ckpt_path variable is set. (2) 'segmentation must based on a pretrained NeRF' is evidently meant as the error message, indicating that segmentation must be based on a pretrained NeRF model. Because the string is joined with "and" rather than separated by a comma, it is part of the condition (a non-empty string is always true), which is why the AssertionError carries no message.
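A minimal sketch of how the two ways of writing such an assertion behave. Only the condition in `check_joined` mirrors the line shown in the traceback; the function names and the comma form are illustrative assumptions, not code from the repository.

```python
def check_joined(path):
    # Form seen in the traceback: the string is and-ed into the condition.
    assert path is not None and 'segmentation must based on a pretrained NeRF'


def check_with_message(path):
    # Conventional form: the expression after the comma becomes the message.
    assert path is not None, 'segmentation must be based on a pretrained NeRF'


def message_of(check, path):
    """Return the AssertionError message, or None if the check passes."""
    try:
        check(path)
    except AssertionError as err:
        return str(err)
    return None


for check in (check_joined, check_with_message):
    print(f"{check.__name__}: {message_of(check, None)!r}")
```

Both forms fail for a missing path, but only the comma form reports why. Note that Python skips assert statements entirely when run with the -O flag.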

I don't know what to do to solve this problem.

2085924055 avatar Mar 05 '24 03:03 2085924055

Have you run run.py successfully to get the pretrained NeRF model? Maybe you can check whether reload_ckpt_path contains the corresponding NeRF model (like fine_last.tar).

Jumpat avatar Mar 05 '24 08:03 Jumpat

Yes. [image: 1709629694024]

2085924055 avatar Mar 05 '24 09:03 2085924055

[image: 1709629617836]

2085924055 avatar Mar 05 '24 09:03 2085924055

I guess this is caused by the missing 'c' in the config file 'seg_kitchen.py'. I mean the expname = 'dvgo_kitchen_unbounded' should be 'dcvgo_kitchen_unbounded'.
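A mismatch like this can be caught before launching the GUI by checking that the expected checkpoint file exists. The sketch below assumes checkpoints are stored as <basedir>/<expname>/fine_last.tar; that layout, the `expected_ckpt_path` helper, and the basedir value are assumptions for illustration and should be checked against the actual config and log directory.

```python
import os


def expected_ckpt_path(basedir, expname, stage="fine"):
    """Build the path where a trained checkpoint is assumed to live.

    Assumed layout: <basedir>/<expname>/<stage>_last.tar
    """
    return os.path.join(basedir, expname, f"{stage}_last.tar")


# Hypothetical basedir; the two expnames are the ones discussed in this thread.
for expname in ("dvgo_kitchen_unbounded", "dcvgo_kitchen_unbounded"):
    path = expected_ckpt_path("./logs/nerf_unbounded", expname)
    status = "found" if os.path.isfile(path) else "missing"
    print(f"{path}: {status}")
```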

Jumpat avatar Mar 05 '24 09:03 Jumpat

Yes, thank you very much for your reply. It can run now.

2085924055 avatar Mar 05 '24 09:03 2085924055

Hello, I would like to ask where the overall model framework is and how to understand it. There seems to be no clearly separated NeRF framework in the code. What should I do if I want to modify the NeRF?

2085924055 avatar Mar 19 '24 07:03 2085924055

Hi! You can find the code about NeRF in lib/dvgo.py (dcvgo, seg_dvgo, ...)

Jumpat avatar Mar 19 '24 11:03 Jumpat

Thanks, I get it. I will take care to understand the code logic. One more question: how can I get the code to run on my own computer if it runs out of video memory there? I don't see where a batch_size can be adjusted. I originally ran it on a separate server, but I now want to move the code to my computer, which is more convenient.

2085924055 avatar Mar 21 '24 04:03 2085924055

There is no batch size in SA3D. To save memory, you can reduce the resolutions used with TensoRF (mask grid resolution, TensoRF grid resolution, rendering resolution, density grid resolution, ...).
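To give a feel for how much reducing a resolution can save, the sketch below compares cell counts before and after halving every axis of a 2D plane and of a 3D grid. The resolutions and the `scaled_cells` helper are hypothetical and are not values or functions from the SA3D configs.

```python
from math import prod


def scaled_cells(resolution, scale):
    """Scale each axis of a grid and return (new_resolution, cell_count)."""
    new_res = [max(1, round(r * scale)) for r in resolution]
    return new_res, prod(new_res)


plane = [512, 512]        # hypothetical 2D plane resolution
volume = [256, 256, 256]  # hypothetical 3D density or mask grid resolution

for name, res in (("plane", plane), ("volume", volume)):
    _, full = scaled_cells(res, 1.0)
    half_res, half = scaled_cells(res, 0.5)
    print(f"{name}: {res} -> {half_res}, memory ratio {half / full:.3f}")
```

Halving each axis cuts a 2D plane to a quarter of its cells and a 3D grid to an eighth, so 3D grids are usually the more effective place to reduce resolution, at the cost of detail.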

Jumpat avatar Mar 21 '24 11:03 Jumpat

Hello, I would like to ask whether the parameters in the code are already optimal. Is further hyperparameter optimization still needed?

2085924055 avatar Apr 14 '24 12:04 2085924055

For some scenes and targets they are. However, it depends on the specific scene and target you choose.

Jumpat avatar Apr 15 '24 01:04 Jumpat

Ok, thank you very much for your reply.

2085924055 avatar Apr 15 '24 02:04 2085924055

Hello, I would like to ask which NeRF paper SA3D is based on.

2085924055 avatar Apr 15 '24 13:04 2085924055

The main branch of SA3D is based on TensoRF.

The NerfStudio branch is based on Nerfacto.

The SA3D-GS branch is based on 3D-GS.

Jumpat avatar Apr 16 '24 01:04 Jumpat

OK, thanks for your reply.

2085924055 avatar Apr 16 '24 01:04 2085924055