sd-webui-segment-anything
[Bug]: GroundingDINO doesn't respect `--device-id` flag
Is there an existing issue for this?
- [X] I have searched the existing issues and checked the recent builds/commits of both this extension and the webui
Have you updated WebUI and this extension to the newest version?
- [X] I have updated WebUI and this extension to the most up-to-date version
Do you understand that you should go to https://github.com/IDEA-Research/Grounded-Segment-Anything/issues if you cannot install GroundingDINO?
- [X] My problem is not about installing GroundingDINO
Do you know that you should use the newest ControlNet extension and enable external control if you want SAM extension to control ControlNet?
- [X] I have updated ControlNet extension and enabled "Allow other script to control this extension"
What happened?
GroundingDINO always accesses GPU 0 even if `--device-id` is set to a non-zero value, and triggers an illegal-memory-access CUDA error when you generate the bounding box again.
Steps to reproduce the problem
- Start WebUI on a multi-GPU server with a non-zero GPU ID, e.g. `./webui.sh --device-id 1`
- Check "Enable GroundingDINO", select a model, and enter some prompts
- Check "I want to preview GroundingDINO detection result and select the boxes I want."
- Click "Generate bounding box" and wait until it finishes
- Click "Generate bounding box" again; you should notice `RuntimeError: CUDA error: an illegal memory access was encountered` in the terminal logs
- Run `nvidia-smi` in another terminal; you should notice a process named `python3` using both GPU 0 and the GPU you specified in step 1
What should have happened?
GroundingDINO should never access GPU 0; it should run entirely on the GPU selected by `--device-id`.
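The expected behavior amounts to deriving the CUDA device from the flag instead of hardcoding `cuda`, which is an alias for `cuda:0`. A minimal sketch of that device selection, where `resolve_device` is a hypothetical helper for illustration and not the extension's actual code:

```python
def resolve_device(device_id=None):
    """Return the torch device string a model should be moved to.

    Hypothetical helper: an extension that hardcodes "cuda" (an alias
    for "cuda:0") ignores --device-id, which is the likely root cause
    of this report. Deriving the string from the flag avoids that.
    """
    return "cuda" if device_id is None else f"cuda:{device_id}"

# A loader that respects the flag would then do something like:
#   state = torch.load(ckpt_path, map_location=resolve_device(device_id))
#   model.load_state_dict(state)
#   model.to(resolve_device(device_id))
# instead of torch.load(ckpt_path) followed by model.cuda(), both of
# which default to GPU 0.

print(resolve_device(7))  # cuda:7
```

The `map_location` argument matters as much as the final `.to()` call: without it, the checkpoint tensors are first materialized on GPU 0 before being moved, which matches the two-GPU `nvidia-smi` output described above.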
Commit where the problem happens
webui: 22bcc7be428c94e9408f589966c2040187245d81
extension: 724b4db6
What browsers do you use to access the UI?
Google Chrome
Command Line Arguments
cmdline:
./webui.sh -f --listen --device-id 7
modified webui-user.sh:
install_dir="/mnt"
I'm running WebUI inside a docker container with:
docker run --name stable-diffusion -it --runtime nvidia --gpus all --ipc host -v ${HOME}:/mnt -p 7860:7860 pytorch/pytorch:1.13.1-cuda11.6-cudnn8-devel
Console logs
Launching Web UI with arguments: -f --listen --device-id 3
No module 'xformers'. Proceeding without it.
Loading weights [1a189f0be6] from /mnt/stable-diffusion-webui/models/Stable-diffusion/sdv1-5-pruned.safetensors
Creating model from config: /mnt/stable-diffusion-webui/configs/v1-inference.yaml
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
Applying cross attention optimization (Doggettx).
Textual inversion embeddings loaded(0):
Model loaded in 2.1s (load weights from disk: 0.6s, create model: 0.4s, apply weights to model: 0.2s, apply half(): 0.2s, load VAE: 0.2s, move model to device: 0.4s).
Running on local URL: http://0.0.0.0:7860
To create a public link, set `share=True` in `launch()`.
Startup time: 9.4s (import torch: 1.0s, import gradio: 1.1s, import ldm: 1.4s, other imports: 1.9s, load scripts: 1.1s, load SD checkpoint: 2.2s, create ui: 0.5s, gradio launch: 0.1s).
Start SAM Processing
Running GroundingDINO Inference
Initializing GroundingDINO GroundingDINO_SwinB (938MB)
final text_encoder_type: bert-base-uncased
/opt/conda/lib/python3.10/site-packages/transformers/modeling_utils.py:768: FutureWarning: The `device` argument is deprecated and will be removed in v5 of Transformers.
warnings.warn(
Initializing SAM
Traceback (most recent call last):
File "/opt/conda/lib/python3.10/site-packages/gradio/routes.py", line 394, in run_predict
output = await app.get_blocks().process_api(
File "/opt/conda/lib/python3.10/site-packages/gradio/blocks.py", line 1075, in process_api
result = await self.call_function(
File "/opt/conda/lib/python3.10/site-packages/gradio/blocks.py", line 884, in call_function
prediction = await anyio.to_thread.run_sync(
File "/opt/conda/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "/opt/conda/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "/opt/conda/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
result = context.run(func, *args)
File "/mnt/stable-diffusion-webui/extensions/sd-webui-segment-anything/scripts/sam.py", line 161, in sam_predict
sam = init_sam_model(sam_model_name)
File "/mnt/stable-diffusion-webui/extensions/sd-webui-segment-anything/scripts/sam.py", line 130, in init_sam_model
sam_model_cache[sam_model_name] = load_sam_model(sam_model_name)
File "/mnt/stable-diffusion-webui/extensions/sd-webui-segment-anything/scripts/sam.py", line 56, in load_sam_model
sam.to(device=device)
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 989, in to
return self._apply(convert)
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 641, in _apply
module._apply(fn)
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 641, in _apply
module._apply(fn)
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 641, in _apply
module._apply(fn)
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 664, in _apply
param_applied = fn(param)
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 987, in convert
return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Additional information
Generated by neofetch on host machine:
OS: Ubuntu 20.04.5 LTS x86_64
Host: X660 G45 Whitley
Kernel: 5.4.0-147-generic
Uptime: 6 hours, 5 mins
Packages: 1199 (dpkg), 4 (snap)
Shell: zsh 5.8
Resolution: 1024x768
Terminal: /dev/pts/3
CPU: Intel Xeon Platinum 8369C (128) @ 3.500GHz
GPU: NVIDIA 8e:00.0 NVIDIA Corporation Device 20b2
GPU: NVIDIA 56:00.0 NVIDIA Corporation Device 20b2
GPU: NVIDIA e8:00.0 NVIDIA Corporation Device 20b2
GPU: NVIDIA 8a:00.0 NVIDIA Corporation Device 20b2
GPU: NVIDIA eb:00.0 NVIDIA Corporation Device 20b2
GPU: NVIDIA 6b:00.0 NVIDIA Corporation Device 20b2
GPU: NVIDIA 71:00.0 NVIDIA Corporation Device 20b2
GPU: NVIDIA 51:00.0 NVIDIA Corporation Device 20b2
Memory: 26134MiB / 1031335MiB
This is SAM's error, and I am unfortunately unable to help you because I do not have access to multiple GPUs. Please post your question at the SAM repository: https://github.com/facebookresearch/segment-anything
Delete the `--listen` flag and try again.
@continue-revolution I don't know much about the implementation details of SAM, so I decided to bypass this issue by passing only the GPU I want to each container. That makes the process think it's in a single-GPU environment, and I no longer have to specify `--device-id`.
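For reference, the isolation described above can be done with Docker's GPU selector (or with `CUDA_VISIBLE_DEVICES`). The command below is a sketch adapted from the `docker run` line earlier in this issue, with `device=1` chosen only as an example:

```shell
# Expose only physical GPU 1 to the container; inside it, that GPU
# appears as cuda:0, so --device-id is no longer needed.
docker run --name stable-diffusion -it --runtime nvidia \
  --gpus '"device=1"' --ipc host -v ${HOME}:/mnt -p 7860:7860 \
  pytorch/pytorch:1.13.1-cuda11.6-cudnn8-devel

# Equivalently, outside Docker:
# CUDA_VISIBLE_DEVICES=1 ./webui.sh -f --listen
```

Either way, code that hardcodes `cuda:0` can then only reach the one GPU you exposed, which sidesteps the bug without fixing it.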
@cdmusic2019 I don't think `--listen` has anything to do with this issue; the only purpose of that flag is to accept remote connections, see here. I can also confirm that `--device-id` has been passed to WebUI correctly; other components do respect this flag.