OOM problem
ref_img_camera_id = 0
mask_img_camera_id = 0

# Take the reference view and convert it to an HxWx3 uint8 image for SAM.
view = cameras[ref_img_camera_id]
img = view.original_image * 255
img = cv2.resize(
    img.permute([1, 2, 0]).detach().cpu().numpy().astype(np.uint8),
    dsize=(1024, 1024), fx=1, fy=1, interpolation=cv2.INTER_LINEAR,
)

# Run the SAM image encoder on the reference view (this is the step that
# runs out of memory, see the traceback below).
predictor.set_image(img)
sam_feature = predictor.features
# sam_feature = view.original_features

# Render the contrastive feature map for the same view and time it.
start_time = time.time()
bg_color = [0 for i in range(FEATURE_DIM)]
background = torch.tensor(bg_color, dtype=torch.float32, device="cuda")
rendered_feature = render_contrastive_feature(view, feature_gaussians, pipeline.extract(args), background)['render']
time1 = time.time() - start_time
H, W = sam_feature.shape[-2:]

print(time1)
plt.imshow(img)
In this block, when running prompt_segmenting, I get the OOM error below. How can I avoid it? My laptop is an ASUS with an RTX 2080 Super (8 GB VRAM).
Windows 11, WSL2 with Ubuntu 22.04, Anaconda environment.
Full traceback:
OutOfMemoryError                          Traceback (most recent call last)
Cell In[15], line 7
      5 img = view.original_image * 255
      6 img = cv2.resize(img.permute([1,2,0]).detach().cpu().numpy().astype(np.uint8),dsize=(1024,1024),fx=1,fy=1,interpolation=cv2.INTER_LINEAR)
----> 7 predictor.set_image(img)
      8 sam_feature = predictor.features
      9 # sam_feature = view.original_features

File ~/SuGaR/SegAnyGAussians/third_party/segment-anything/segment_anything/predictor.py:60, in SamPredictor.set_image(self, image, image_format)
     57 input_image_torch = torch.as_tensor(input_image, device=self.device)
     58 input_image_torch = input_image_torch.permute(2, 0, 1).contiguous()[None, :, :, :]
---> 60 self.set_torch_image(input_image_torch, image.shape[:2])

File ~/anaconda3/envs/seggau/lib/python3.9/site-packages/torch/utils/_contextlib.py:115, in context_decorator.

File ~/SuGaR/SegAnyGAussians/third_party/segment-anything/segment_anything/predictor.py:89, in SamPredictor.set_torch_image(self, transformed_image, original_image_size)
     87 self.input_size = tuple(transformed_image.shape[-2:])
     88 input_image = self.model.preprocess(transformed_image)
---> 89 self.features = self.model.image_encoder(input_image)
     90 self.is_image_set = True

File ~/anaconda3/envs/seggau/lib/python3.9/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
   1496 # If we don't have any hooks, we want to skip the rest of the logic in
   1497 # this function, and just call forward.
   1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1499         or _global_backward_pre_hooks or _global_backward_hooks
   1500         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501     return forward_call(*args, **kwargs)
   1502 # Do not call functions when jit is used
   1503 full_backward_hooks, non_full_backward_hooks = [], []

File ~/SuGaR/SegAnyGAussians/third_party/segment-anything/segment_anything/modeling/image_encoder.py:112, in ImageEncoderViT.forward(self, x)
    109 x = x + self.pos_embed
    111 for blk in self.blocks:
--> 112     x = blk(x)
    114 x = self.neck(x.permute(0, 3, 1, 2))
    116 return x

File ~/anaconda3/envs/seggau/lib/python3.9/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
   1496 # If we don't have any hooks, we want to skip the rest of the logic in
   1497 # this function, and just call forward.
   1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1499         or _global_backward_pre_hooks or _global_backward_hooks
   1500         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501     return forward_call(*args, **kwargs)
   1502 # Do not call functions when jit is used
   1503 full_backward_hooks, non_full_backward_hooks = [], []

File ~/SuGaR/SegAnyGAussians/third_party/segment-anything/segment_anything/modeling/image_encoder.py:174, in Block.forward(self, x)
    171 H, W = x.shape[1], x.shape[2]
    172 x, pad_hw = window_partition(x, self.window_size)
--> 174 x = self.attn(x)
    175 # Reverse window partition
    176 if self.window_size > 0:

File ~/anaconda3/envs/seggau/lib/python3.9/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
   1496 # If we don't have any hooks, we want to skip the rest of the logic in
   1497 # this function, and just call forward.
   1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1499         or _global_backward_pre_hooks or _global_backward_hooks
   1500         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501     return forward_call(*args, **kwargs)
   1502 # Do not call functions when jit is used
   1503 full_backward_hooks, non_full_backward_hooks = [], []

File ~/SuGaR/SegAnyGAussians/third_party/segment-anything/segment_anything/modeling/image_encoder.py:234, in Attention.forward(self, x)
    231 attn = (q * self.scale) @ k.transpose(-2, -1)
    233 if self.use_rel_pos:
--> 234     attn = add_decomposed_rel_pos(attn, q, self.rel_pos_h, self.rel_pos_w, (H, W), (H, W))
    236 attn = attn.softmax(dim=-1)
    237 x = (attn @ v).view(B, self.num_heads, H, W, -1).permute(0, 2, 3, 1, 4).reshape(B, H, W, -1)

File ~/SuGaR/SegAnyGAussians/third_party/segment-anything/segment_anything/modeling/image_encoder.py:358, in add_decomposed_rel_pos(attn, q, rel_pos_h, rel_pos_w, q_size, k_size)
    354 rel_h = torch.einsum("bhwc,hkc->bhwk", r_q, Rh)
    355 rel_w = torch.einsum("bhwc,wkc->bhwk", r_q, Rw)
    357 attn = (
--> 358     attn.view(B, q_h, q_w, k_h, k_w) + rel_h[:, :, :, :, None] + rel_w[:, :, :, None, :]
    359 ).view(B, q_h * q_w, k_h * k_w)
    361 return attn

OutOfMemoryError: CUDA out of memory. Tried to allocate 1024.00 MiB (GPU 0; 8.00 GiB total capacity; 5.93 GiB already allocated; 0 bytes free; 7.19 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
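The last line of the error itself points at one knob: the allocator's `max_split_size_mb` option, set through `PYTORCH_CUDA_ALLOC_CONF`. Below is a minimal sketch of setting it from Python (the 128 MiB value is just an example); it has to be set before torch initializes CUDA, and it only mitigates fragmentation rather than shrinking the encoder's actual footprint:

```python
import os

# Must be set before torch initializes CUDA for the option to take effect.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch  # imported only after the allocator option is configured
```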
In such a situation you may have to reduce the rendering resolution of both the image and the mask. But I think 8 GB is probably too small for this code. You may also need to use a much smaller SAM model for extracting masks, extracting features, and inference.
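For example, a minimal sketch of what swapping in the smallest official SAM backbone could look like; the `vit_b` checkpoint path is a placeholder, and the rest of the pipeline (mask extraction, feature training) would have to use features produced by the same smaller model:

```python
from segment_anything import sam_model_registry, SamPredictor

# "vit_b" is the smallest official SAM image encoder; its activations are
# much lighter than the default "vit_h" used in most configs.
sam = sam_model_registry["vit_b"](checkpoint="path/to/sam_vit_b_01ec64.pth")
sam.to("cuda").eval()
predictor = SamPredictor(sam)

predictor.set_image(img)          # img: HxWx3 uint8 RGB, as in the cell above
sam_feature = predictor.features  # SAM image embedding from the smaller encoder
```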
Thank you. So I should use a smaller pretrained ViT model?
Yeah, but even that may not completely solve the problem.
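For anyone else stuck at 8 GB, a rough sketch of two further stop-gaps on top of the suggestions above, with no guarantee they are enough: release whatever the notebook no longer needs before the encoder runs, and run the encoder under fp16 autocast, which roughly halves activation memory (assuming the SAM checkpoint behaves well in mixed precision):

```python
import gc
import torch

# Free dead references and return cached blocks to the driver before the
# large allocation inside the SAM image encoder.
gc.collect()
torch.cuda.empty_cache()

# Mixed-precision inference: activations inside the ViT encoder are kept
# in fp16, roughly halving its peak memory use.
with torch.autocast(device_type="cuda", dtype=torch.float16):
    predictor.set_image(img)
sam_feature = predictor.features  # note: may come back as fp16
```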