Vox-Fusion
CUDA out of memory.
I am trying to run scannet/scene0059, but got a CUDA out of memory error. Here is the error message:
home/slam/.conda/envs/ngp_pl/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/home/slam/.conda/envs/ngp_pl/lib/python3.8/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/Vox-Fusion/src/tracking.py", line 97, in spin
self.do_tracking(share_data, current_frame, kf_buffer)
File "/Vox-Fusion/src/tracking.py", line 128, in do_tracking
frame_pose, hit_mask = track_frame(
File "/Vox-Fusion/src/variations/render_helpers.py", line 450, in track_frame
final_outputs = render_rays(
File "/Vox-Fusion/src/variations/render_helpers.py", line 223, in render_rays
samples = ray_sample(intersections, step_size=step_size)
File "/home/slam/.conda/envs/ngp_pl/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/Vox-Fusion/src/variations/voxel_helpers.py", line 575, in ray_sample
sampled_idx, sampled_depth, sampled_dists = inverse_cdf_sampling(
File "/Vox-Fusion/src/variations/voxel_helpers.py", line 292, in forward
noise = min_depth.new_zeros(*min_depth.size()[:-1], max_steps)
RuntimeError: CUDA out of memory. Tried to allocate 745.06 GiB (GPU 0; 23.70 GiB total capacity; 146.05 MiB already allocated; 11.93 GiB free; 176.00 MiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
[W CudaIPCTypes.cpp:15] Producer process has been terminated before all shared CUDA tensors released. See Note [Sharing CUDA tensors]
^CTraceback (most recent call last):
File "demo/run.py", line 23, in <module>
slam.wait_child_processes()
File "/Vox-Fusion/src/voxslam.py", line 62, in wait_child_processes
p.join()
File "/home/slam/.conda/envs/ngp_pl/lib/python3.8/multiprocessing/process.py", line 149, in join
res = self._popen.wait(timeout)
File "/home/slam/.conda/envs/ngp_pl/lib/python3.8/multiprocessing/popen_fork.py", line 47, in wait
return self.poll(os.WNOHANG if timeout == 0.0 else 0)
File "/home/slam/.conda/envs/ngp_pl/lib/python3.8/multiprocessing/popen_fork.py", line 27, in poll
pid, sts = os.waitpid(self.pid, flag)
KeyboardInterrupt
Process Process-2:
Traceback (most recent call last):
File "/home/slam/.conda/envs/ngp_pl/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/home/slam/.conda/envs/ngp_pl/lib/python3.8/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/Vox-Fusion/src/mapping.py", line 89, in spin
if not kf_buffer.empty():
File "/home/slam/.conda/envs/ngp_pl/lib/python3.8/multiprocessing/queues.py", line 123, in empty
return not self._poll()
File "/home/slam/.conda/envs/ngp_pl/lib/python3.8/multiprocessing/connection.py", line 257, in poll
return self._poll(timeout)
File "/home/slam/.conda/envs/ngp_pl/lib/python3.8/multiprocessing/connection.py", line 424, in _poll
r = wait([self], timeout)
File "/home/slam/.conda/envs/ngp_pl/lib/python3.8/multiprocessing/connection.py", line 925, in wait
selector.register(obj, selectors.EVENT_READ)
File "/home/slam/.conda/envs/ngp_pl/lib/python3.8/selectors.py", line 352, in register
key = super().register(fileobj, events, data)
File "/home/slam/.conda/envs/ngp_pl/lib/python3.8/selectors.py", line 235, in register
if (not events) or (events & ~(EVENT_READ | EVENT_WRITE)):
KeyboardInterrupt
/home/slam/.conda/envs/ngp_pl/lib/python3.8/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 3 leaked semaphore objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '
I got the same problem! Hoping for a reply.
This problem might need to be diagnosed with more intermediate results. What do the predicted color and depth maps look like? (generated with the render_freq option)
I don't have the predicted color and depth maps at the moment; I hope @Xiaxia1997 can provide more information.
I can share what I found: I printed *min_depth.size()[:-1] and max_steps, and max_steps is huge, around 8e8. Checking the source code, it might be a problem with max_distance and min_distance here.
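For context, a back-of-the-envelope check shows why an oversized max_steps blows up the new_zeros allocation in inverse_cdf_sampling. This is a minimal sketch: only max_steps ≈ 8e8 and the 745 GiB figure come from the log above; the ray count and dtype are assumptions.

```python
# Rough estimate of the noise tensor allocated by
#   noise = min_depth.new_zeros(*min_depth.size()[:-1], max_steps)
num_rays = 250            # assumed product of min_depth's leading dimensions
max_steps = 8 * 10**8     # order of magnitude observed in the debug print above
bytes_per_elem = 4        # assuming float32

total_bytes = num_rays * max_steps * bytes_per_elem
print(f"{total_bytes / 2**30:.2f} GiB")  # ~745 GiB, matching the OOM message
```

The allocation scales linearly with max_steps, so anything that inflates max_distance - min_distance translates directly into an impossibly large tensor.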
I encounter this error each time there is a loop during tracking. It seems the ray intersects with very distant voxels, causing the max distance to become very large.
I wonder how this problem can be solved. I see a max_depth in the config; maybe voxels beyond the max_depth value should be ignored? A rough sketch of that idea follows.
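A minimal sketch of that workaround (all names here are hypothetical; the actual layout of the intersection tensors inside Vox-Fusion's ray_sample may differ): clamp or mask the per-voxel hit depths against the configured max_depth before the step count is derived from them.

```python
import torch

def clamp_intersections(min_depth: torch.Tensor,
                        max_depth: torch.Tensor,
                        cfg_max_depth: float = 10.0):
    """Hypothetical helper: drop/clamp voxel hits beyond the configured max_depth.

    min_depth, max_depth: per-ray, per-voxel entry/exit depths, shape (..., n_voxels)
    cfg_max_depth: the max_depth value from the dataset config (assumed to be meters)
    """
    # Hits that start beyond the configured range are treated as invalid.
    valid = min_depth < cfg_max_depth
    # Clamp exit depths so (max_depth - min_depth) / step_size stays bounded.
    max_depth = max_depth.clamp(max=cfg_max_depth)
    # Invalid hits get zero length, so they no longer inflate max_steps.
    min_depth = torch.where(valid, min_depth, max_depth)
    return min_depth, max_depth, valid
```

With something like this applied before inverse_cdf_sampling, max_steps would be bounded by roughly cfg_max_depth / step_size instead of exploding when a ray hits a spurious far-away voxel.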