
CUDA Out of Memory Issue

Open · bulutharbeli opened this issue on May 28, 2023 · 10 comments

I used the smallest model (sam_vit_b_01ec64.pth) and disabled all extensions, but I still get the same error. How can I fix this? Any ideas? (screenshot attached: Screenshot_31)

bulutharbeli avatar May 28 '23 20:05 bulutharbeli

The current workaround can be found in Issue #20. However, I will keep investigating whether there is a way to run SAM with less VRAM.

Uminosachi avatar May 29 '23 01:05 Uminosachi

What do you need at a minimum to run this extension? I'm also running out of VRAM, even with the base segmentation model. I also tried removing all other extensions on a clean setup of a1111.

EDIT:

I was finally able to get segmentation to work with the base model, but I had to halve the image size. Even then, it still fails at the actual inpainting step with VRAM errors. My question is: why can I inpaint fine without this extension, but run out of VRAM when inpainting with it? Does it inpaint the entire image, or does it work on a focused region of 512x512 or 768x768, etc.? I would love to use this extension!
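For reference, a minimal sketch of the halving workaround described above, using Pillow (the file paths are placeholders):

```python
from PIL import Image

# Halve both dimensions before handing the image to SAM, as described above.
img = Image.open("input.png")  # placeholder path
img = img.resize((img.width // 2, img.height // 2), Image.LANCZOS)
img.save("input_half.png")
```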

woofodus avatar May 29 '23 19:05 woofodus

> The current workaround can be found in Issue #20. However, I will keep investigating whether there is a way to run SAM with less VRAM.

Thank you for your response, but I had already tried the options you mentioned before writing my post, and they didn't help. I'm still getting the same error.

New edit: I have installed the standalone version at https://github.com/Uminosachi/inpaint-anything. Everything looked OK, but when I tried to "Run Segment Anything" it showed this error (screenshot attached: Screenshot_32)

bulutharbeli avatar May 30 '23 06:05 bulutharbeli

Did you find a solution?

I have 6GB of VRAM, by the way.

The sam_vit_b_01ec64.pth model does seem to work with a 386x386 image, though the result is terrible.

SamAutomaticMaskGenerator sam_vit_l_0b3195.pth
Traceback (most recent call last):
  File "E:\media\A.I\A1111\stable-diffusion-webui\venv\lib\site-packages\gradio\routes.py", line 414, in run_predict
    output = await app.get_blocks().process_api(
  File "E:\media\A.I\A1111\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 1323, in process_api
    result = await self.call_function(
  File "E:\media\A.I\A1111\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 1051, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "E:\media\A.I\A1111\stable-diffusion-webui\venv\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "E:\media\A.I\A1111\stable-diffusion-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "E:\media\A.I\A1111\stable-diffusion-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "E:\media\A.I\A1111\stable-diffusion-webui\extensions\sd-webui-inpaint-anything\scripts\main.py", line 179, in run_sam
    sam_masks = sam_mask_generator.generate(input_image)
  File "E:\media\A.I\A1111\stable-diffusion-webui\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "E:\media\A.I\A1111\stable-diffusion-webui\venv\lib\site-packages\segment_anything\automatic_mask_generator.py", line 163, in generate
    mask_data = self._generate_masks(image)
  File "E:\media\A.I\A1111\stable-diffusion-webui\venv\lib\site-packages\segment_anything\automatic_mask_generator.py", line 206, in _generate_masks
    crop_data = self._process_crop(image, crop_box, layer_idx, orig_size)
  File "E:\media\A.I\A1111\stable-diffusion-webui\venv\lib\site-packages\segment_anything\automatic_mask_generator.py", line 236, in _process_crop
    self.predictor.set_image(cropped_im)
  File "E:\media\A.I\A1111\stable-diffusion-webui\venv\lib\site-packages\segment_anything\predictor.py", line 60, in set_image
    self.set_torch_image(input_image_torch, image.shape[:2])
  File "E:\media\A.I\A1111\stable-diffusion-webui\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "E:\media\A.I\A1111\stable-diffusion-webui\venv\lib\site-packages\segment_anything\predictor.py", line 89, in set_torch_image
    self.features = self.model.image_encoder(input_image)
  File "E:\media\A.I\A1111\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\media\A.I\A1111\stable-diffusion-webui\venv\lib\site-packages\segment_anything\modeling\image_encoder.py", line 112, in forward
    x = blk(x)
  File "E:\media\A.I\A1111\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\media\A.I\A1111\stable-diffusion-webui\venv\lib\site-packages\segment_anything\modeling\image_encoder.py", line 174, in forward
    x = self.attn(x)
  File "E:\media\A.I\A1111\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\media\A.I\A1111\stable-diffusion-webui\venv\lib\site-packages\segment_anything\modeling\image_encoder.py", line 234, in forward
    attn = add_decomposed_rel_pos(attn, q, self.rel_pos_h, self.rel_pos_w, (H, W), (H, W))
  File "E:\media\A.I\A1111\stable-diffusion-webui\venv\lib\site-packages\segment_anything\modeling\image_encoder.py", line 358, in add_decomposed_rel_pos
    attn.view(B, q_h, q_w, k_h, k_w) + rel_h[:, :, :, :, None] + rel_w[:, :, :, None, :]
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1024.00 MiB (GPU 0; 6.00 GiB total capacity; 4.33 GiB already allocated; 0 bytes free; 4.36 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

spamnco avatar May 31 '23 19:05 spamnco

> Did you find a solution?
>
> I have 6GB of VRAM, by the way.
>
> The sam_vit_b_01ec64.pth model does seem to work with a 386x386 image, though the result is terrible.

Unfortunately, no. I have tried so many ways to fix it, but nothing helps. It still doesn't work.

bulutharbeli avatar Jun 01 '23 07:06 bulutharbeli

I guess 8GB of VRAM is the strict minimum (although I'm not sure even that is enough). If only the lowest model were able to correctly identify objects.
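For anyone debugging this, a quick way to check how much VRAM is actually free before SAM loads (a minimal sketch using PyTorch's allocator query; GPU index 0 assumed):

```python
import torch

# Report free vs. total VRAM on GPU 0 before loading a SAM model.
free, total = torch.cuda.mem_get_info(0)
print(f"free: {free / 2**30:.2f} GiB of {total / 2**30:.2f} GiB")
```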

spamnco avatar Jun 01 '23 11:06 spamnco

I attempted a different approach that runs SAM through HuggingFace's transformers pipeline, but I found that its VRAM usage was nearly identical to the current approach.
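The comment doesn't show the exact code tried; a minimal sketch of the transformers mask-generation pipeline (available in transformers >= 4.29; the model ID and batch size here are illustrative):

```python
from PIL import Image
from transformers import pipeline

# "mask-generation" is the transformers task name for SAM-style automatic masking.
generator = pipeline("mask-generation", model="facebook/sam-vit-base", device=0)

image = Image.open("input.png").convert("RGB")  # placeholder path
# Lowering points_per_batch trades speed for lower peak VRAM.
outputs = generator(image, points_per_batch=16)
print(f"{len(outputs['masks'])} masks generated")
```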

Uminosachi avatar Jun 01 '23 11:06 Uminosachi

In v1.6.4 I've added a process that unloads the SD model, which isn't used within the function that handles SAM, and reloads it afterwards. As a result, it should now be possible to process larger image sizes with SAM than was previously possible.
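The extension's exact implementation isn't shown here; as a rough sketch of the general pattern (move the idle model off the GPU and clear PyTorch's cache before running SAM; `sd_model` and `run_sam` are hypothetical names):

```python
import gc
import torch

def run_sam_with_freed_vram(sd_model, run_sam, image):
    """Free VRAM held by the idle SD model before running SAM, then restore it."""
    sd_model.to("cpu")        # move the Stable Diffusion weights to system RAM
    gc.collect()
    torch.cuda.empty_cache()  # return cached allocator blocks to the driver
    try:
        return run_sam(image)  # SAM now has (almost) the whole GPU to itself
    finally:
        sd_model.to("cuda")    # reload the SD model onto the GPU afterwards
```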

Uminosachi avatar Jun 08 '23 02:06 Uminosachi

> In v1.6.4 I've added a process that unloads the SD model, which isn't used within the function that handles SAM, and reloads it afterwards. As a result, it should now be possible to process larger image sizes with SAM than was previously possible.

Unfortunately, I'm still getting the same error. I am using a 512x512 image, but it's still the same. :/

File "E:\AI\stable-diffusion-webui\venv\lib\site-packages\gradio\routes.py", line 422, in run_predict output = await app.get_blocks().process_api( File "E:\AI\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 1323, in process_api result = await self.call_function( File "E:\AI\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 1051, in call_function prediction = await anyio.to_thread.run_sync( File "E:\AI\stable-diffusion-webui\venv\lib\site-packages\anyio\to_thread.py", line 31, in run_sync return await get_asynclib().run_sync_in_worker_thread( File "E:\AI\stable-diffusion-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread return await future File "E:\AI\stable-diffusion-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run result = context.run(func, *args) File "E:\AI\stable-diffusion-webui\extensions\sd-webui-inpaint-anything\scripts\main.py", line 244, in run_sam sam_masks = sam_mask_generator.generate(input_image) File "E:\AI\stable-diffusion-webui\venv\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context return func(*args, **kwargs) File "E:\AI\stable-diffusion-webui\venv\lib\site-packages\segment_anything\automatic_mask_generator.py", line 163, in generate mask_data = self._generate_masks(image) File "E:\AI\stable-diffusion-webui\venv\lib\site-packages\segment_anything\automatic_mask_generator.py", line 206, in _generate_masks crop_data = self._process_crop(image, crop_box, layer_idx, orig_size) File "E:\AI\stable-diffusion-webui\venv\lib\site-packages\segment_anything\automatic_mask_generator.py", line 236, in _process_crop self.predictor.set_image(cropped_im) File "E:\AI\stable-diffusion-webui\venv\lib\site-packages\segment_anything\predictor.py", line 60, in set_image self.set_torch_image(input_image_torch, image.shape[:2]) File "E:\AI\stable-diffusion-webui\venv\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context return func(*args, **kwargs) File "E:\AI\stable-diffusion-webui\venv\lib\site-packages\segment_anything\predictor.py", line 89, in set_torch_image self.features = self.model.image_encoder(input_image) File "E:\AI\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl return forward_call(*input, **kwargs) File "E:\AI\stable-diffusion-webui\venv\lib\site-packages\segment_anything\modeling\image_encoder.py", line 112, in forward x = blk(x) File "E:\AI\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl return forward_call(*input, **kwargs) File "E:\AI\stable-diffusion-webui\venv\lib\site-packages\segment_anything\modeling\image_encoder.py", line 174, in forward x = self.attn(x) File "E:\AI\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl return forward_call(*input, **kwargs) File "E:\AI\stable-diffusion-webui\venv\lib\site-packages\segment_anything\modeling\image_encoder.py", line 234, in forward attn = add_decomposed_rel_pos(attn, q, self.rel_pos_h, self.rel_pos_w, (H, W), (H, W)) File "E:\AI\stable-diffusion-webui\venv\lib\site-packages\segment_anything\modeling\image_encoder.py", line 358, in add_decomposed_rel_pos attn.view(B, q_h, q_w, k_h, k_w) + rel_h[:, :, :, :, None] + rel_w[:, :, :, None, :] torch.cuda.OutOfMemoryError: CUDA out of memory. 
Tried to allocate 1024.00 MiB (GPU 0; 4.00 GiB total capacity; 3.29 GiB already allocated; 0 bytes free; 3.32 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

bulutharbeli avatar Jun 08 '23 07:06 bulutharbeli

> In v1.6.4 I've added a process that unloads the SD model, which isn't used within the function that handles SAM, and reloads it afterwards. As a result, it should now be possible to process larger image sizes with SAM than was previously possible.

> Unfortunately, I'm still getting the same error. I am using a 512x512 image, but it's still the same. :/

Finally it worked, but only with the base model and a 512x512 image. Thanks anyway! :)

bulutharbeli avatar Jun 08 '23 08:06 bulutharbeli