
[BUG] This module is not using GPU at all.

Open KVignesh122 opened this issue 1 year ago • 10 comments

import rembg
from PIL import Image

rembg_session = rembg.new_session()
rembg.remove(data=Image.open(input_path), session=rembg_session, only_mask=True)

rembg_session.inner_session.get_providers()  # prints ['CPUExecutionProvider']

I also saw that there were no spikes at all on my GPU RAM graph.

I am running this code on Google Colab with a T4 runtime. Here is the link to a project, [Video Background Removal], that I created for my non-technical colleagues: https://colab.research.google.com/drive/16AslpibFerebpJXULY0C8oCPH8Dqrvim?usp=sharing

I have Colab Pro and tried it on other GPUs too, but rembg did not use the GPU at all.

KVignesh122 avatar May 16 '24 13:05 KVignesh122

The same problem here. I've installed onnxruntime-gpu and rembg[gpu] successfully, but it doesn't use the GPU. Maybe we should run it some other way than just rembg i ...?

ProgrammingLife avatar May 21 '24 08:05 ProgrammingLife

I think this is an issue with how onnxruntime-gpu installs.

Try installing your usual dependencies and then afterwards run pip install --force-reinstall onnxruntime-gpu

jalsop24 avatar May 22 '24 10:05 jalsop24

I have actually tried every approach I could find.

1. Pip installing rembg[gpu] automatically disables GPU support

Installing rembg[gpu] automatically installs both onnxruntime-gpu AND onnxruntime. I read somewhere that this may cause a package conflict and in turn disable GPU support.

For instance, if I pip install rembg[gpu] as suggested and run this code:

import torch
import onnxruntime as ort
print(ort.get_available_providers()) # Available providers are ['AzureExecutionProvider', 'CPUExecutionProvider']

So I pip installed the bare rembg[gpu] without any of its dependencies, and then manually installed all the dependencies except onnxruntime:

pip install onnxruntime-gpu
pip install "rembg[gpu]" --no-deps
pip install jsonschema numpy opencv-python-headless pillow pooch pymatting scikit-image scipy tqdm

Doing this shows:

import torch
import onnxruntime as ort
print(ort.get_available_providers()) # Available providers are ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'AzureExecutionProvider', 'CPUExecutionProvider']

So clearly, a normal pip install of rembg[gpu] removes 'CUDAExecutionProvider' from the start and falls back to CPU-only support... I get the same result with pip install --force-reinstall onnxruntime-gpu @jalsop24 :(
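The conflict can be spotted programmatically: if both the onnxruntime and onnxruntime-gpu distributions are installed at once, the CPU wheel typically shadows the GPU one. A minimal diagnostic sketch (the package names are the real PyPI distribution names, but the helper itself is my own, not part of rembg or onnxruntime):

```python
from importlib import metadata

def ort_conflict(installed):
    """Return True if both the CPU and GPU onnxruntime wheels are present,
    which typically makes the CUDA provider unavailable."""
    names = {(name or "").lower() for name in installed}
    return "onnxruntime" in names and "onnxruntime-gpu" in names

def installed_distributions():
    """Names of all installed distributions in the current environment."""
    return [dist.metadata["Name"] for dist in metadata.distributions()]

if __name__ == "__main__":
    if ort_conflict(installed_distributions()):
        print("Conflict: uninstall both wheels, then reinstall onnxruntime-gpu only.")
    else:
        print("No onnxruntime CPU/GPU conflict detected.")
```

Running this right after `pip install rembg[gpu]` would confirm whether the double install described above has happened in your environment.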

2. Tried torch 2.1 and CUDA 11

With that in mind, I also read that there may be a problem with CUDA 12 support in PyTorch 2.2.x, so I forced the program to run on PyTorch 2.1.2 with CUDA 11.8 instead:

pip install torch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 --index-url https://download.pytorch.org/whl/cu118
pip install nvidia-cudnn-cu11
pip install tensorrt-cu11

import torch
print(torch.__version__)  # prints 2.1.2+cu118

But nope, still no GPU usage.

3. Hardcoded 'CUDAExecutionProvider' into rembg_session.inner_session._providers

I tried this too:

rembg_session = rembg.new_session()
rembg_session.inner_session._providers = ['CUDAExecutionProvider']
print(rembg_session.inner_session.get_providers())

Output:

*************** EP Error ***************
EP Error /onnxruntime_src/onnxruntime/python/onnxruntime_pybind_state.cc:456 void onnxruntime::python::RegisterTensorRTPluginsAsCustomOps(onnxruntime::python::PySessionOptions&, const ProviderOptions&) Please install TensorRT libraries as mentioned in the GPU requirements page, make sure they're in the PATH or LD_LIBRARY_PATH, and that your GPU is supported.
 when using ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'AzureExecutionProvider', 'CPUExecutionProvider']
Falling back to ['CUDAExecutionProvider', 'CPUExecutionProvider'] and retrying.
****************************************
['CUDAExecutionProvider']

The EP Error makes no sense either, because I already installed the "TensorRT libraries as mentioned in the GPU requirements page"; in any case it falls back to ['CUDAExecutionProvider', 'CPUExecutionProvider']. But somehow, at runtime, the provider falls back to 'CPUExecutionProvider' alone once again, and there is no GPU usage. I inspected all the code for how the provider is selected, but I still could not figure this out...

KVignesh122 avatar May 22 '24 13:05 KVignesh122

same issue

Bouts2019 avatar May 26 '24 08:05 Bouts2019

Same issue here. I checked the installation matrix and installed the provided wheels for Jetson from the Jetson Zoo. Then I pip installed rembg[gpu], which has onnxruntime as a dependency, so it also installed the CPU package. I tried to pip uninstall the onnxruntime package and force-reinstall onnxruntime-gpu, but then rembg gets stuck in a loop on import.

So only CPU is working on Jetson.

Bonitodelcapo avatar May 27 '24 16:05 Bonitodelcapo

Here is my solution:

Based on @KVignesh122's work. Thank you, buddy!

# based on cuda 11.8, other versions may have compatibility issues
# for cudnn version check https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirements
# and https://docs.nvidia.com/deeplearning/cudnn/archives/cudnn-895/install-guide/index.html
sudo apt-get install libcudnn8=8.9.2.26-1+cuda11.8 -y
# directly installing rembg[gpu] will cause dependency issues
# so install it separately
pip3 install onnxruntime-gpu==1.18.0
pip3 install "rembg[gpu]==2.0.50" --no-deps
pip3 install numpy opencv-python-headless pillow pooch pymatting scikit-image scipy tqdm

# test & enjoy
python3 -c "from rembg import remove, new_session; from PIL import Image; output = remove(Image.open('i.png'), session=new_session('u2net', ['CUDAExecutionProvider'])); output.save('o.png')"

Mr47hsy avatar May 29 '24 03:05 Mr47hsy

print(new_session('u2net', ['CUDAExecutionProvider']).inner_session.get_providers())
# You still get ['CPUExecutionProvider']

Solution still doesn't work buddy @Mr47hsy 😭

KVig122 avatar May 29 '24 12:05 KVig122

@KVig122 hi, it worked on my side: [screenshot]

Try to identify the specific cause by:

import onnxruntime as ort

print(f"onnxruntime device: {ort.get_device()}") # output: GPU
print(f'ort avail providers: {ort.get_available_providers()}') # output: ['CUDAExecutionProvider', 'CPUExecutionProvider']

ort_session = ort.InferenceSession('/root/.u2net/u2net.onnx', providers=["CUDAExecutionProvider"])
print(ort_session.get_providers())

Mr47hsy avatar May 29 '24 13:05 Mr47hsy

Same error here. Excellent results, but CPU only.

WindowsNT avatar Jun 04 '24 00:06 WindowsNT

This issue is stale because it has been open for 30 days with no activity.

github-actions[bot] avatar Jul 04 '24 01:07 github-actions[bot]

This issue was closed because it has been inactive for 14 days since being marked as stale.

github-actions[bot] avatar Jul 18 '24 01:07 github-actions[bot]

I had the same issue with the Docker instance: it would not use CUDA (it still complains about TensorRT, however). By deploying this https://hub.docker.com/r/unsgn/onnxruntime-cuda Docker image with bash, then running pip install rembg[gpu,cli] and launching rembg s --port 7000 to download the models, I get great CUDA performance.

# Use ONNX Runtime with CUDA base image
FROM unsgn/onnxruntime-cuda

# Install system dependencies
RUN apt-get update && apt-get install -y \
    python3-pip \
    wget \
    libglib2.0-0 \
    libsm6 \
    libxext6 \
    libxrender1 \
    && pip3 install --upgrade pip

# Install specific versions of ONNX and ONNX Runtime with GPU support
# (may not be needed, however this is the suggested stable version for CUDA 12.6)
RUN python3 -m pip install onnxruntime==1.16.3 onnxruntime-gpu==1.16.3

# Install rembg with GPU and CLI support
RUN pip3 install rembg[gpu,cli]

# Set working directory for rembg
WORKDIR /app

# Download models
RUN rembg d

# Expose port 7000 for the rembg API (optional)
EXPOSE 7000

# Set the command to run the rembg CLI tool in the container
ENTRYPOINT ["rembg"]
CMD ["s"]

akadata avatar Nov 14 '24 14:11 akadata

I'm also having the same issue... Any workaround?

jitendra-koodo avatar Jan 15 '25 16:01 jitendra-koodo

I also have the same issue

RyanHangZhou avatar Jan 16 '25 05:01 RyanHangZhou

I'm getting the following error with the GPU option on Windows 11. Can anyone help with the issue and a resolution? The tensorrt module is installed in the venv, and PATH is set with the correct CUDA lib dir.

$ rembg i --model u2net frame_0001.jpg frame_0001_processed.png

2025-01-16 10:08:44.4374789 [E:onnxruntime:Default, provider_bridge_ort.cc:1848 onnxruntime::TryGetProviderInfo_TensorRT] D:\a_work\1\s\onnxruntime\core\session\provider_bridge_ort.cc:1539 onnxruntime::ProviderLibrary::Get [ONNXRuntimeError] : 1 : FAIL : LoadLibrary failed with error 126 "" when trying to load "E:\python\venv\Lib\site-packages\onnxruntime\capi\onnxruntime_providers_tensorrt.dll"

2025-01-16 10:08:44.5573621 [E:onnxruntime:Default, provider_bridge_ort.cc:1862 onnxruntime::TryGetProviderInfo_CUDA] D:\a_work\1\s\onnxruntime\core\session\provider_bridge_ort.cc:1539 onnxruntime::ProviderLibrary::Get [ONNXRuntimeError] : 1 : FAIL : LoadLibrary failed with error 126 "" when trying to load "E:\python\venv\Lib\site-packages\onnxruntime\capi\onnxruntime_providers_cuda.dll"

*************** EP Error *************** EP Error D:\a_work\1\s\onnxruntime\python\onnxruntime_pybind_state.cc:507 onnxruntime::python::RegisterTensorRTPluginsAsCustomOps Please install TensorRT libraries as mentioned in the GPU requirements page, make sure they're in the PATH or LD_LIBRARY_PATH, and that your GPU is supported. when using ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider'] Falling back to ['CUDAExecutionProvider', 'CPUExecutionProvider'] and retrying.
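LoadLibrary error 126 on Windows generally means a *dependent* DLL (typically the CUDA runtime or cuDNN) could not be resolved from PATH, even when the provider DLL named in the message exists. A minimal pre-flight sketch for checking this; the DLL names below are examples for a CUDA 12 / cuDNN 9 install and may differ for yours, and the helper is my own, not part of onnxruntime:

```python
import os

def missing_dlls(dll_names, search_dirs):
    """Return the DLLs from dll_names not found in any of search_dirs."""
    missing = []
    for name in dll_names:
        if not any(os.path.isfile(os.path.join(d, name)) for d in search_dirs):
            missing.append(name)
    return missing

if __name__ == "__main__":
    # Example DLL names; adjust to your CUDA/cuDNN versions.
    required = ["cudart64_12.dll", "cublas64_12.dll", "cudnn64_9.dll"]
    dirs = os.environ.get("PATH", "").split(os.pathsep)
    print("Missing from PATH:", missing_dlls(required, dirs))
```

If anything shows up as missing, adding its directory to PATH (or, on Python 3.8+, via `os.add_dll_directory`) before creating the session is the usual fix.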

jitendra-koodo avatar Jan 16 '25 05:01 jitendra-koodo

After fixing the CUDA issues, rembg seems to use the GPU, but I'm seeing strange behavior:

The GPU takes more than 20x longer than the CPU to process one image with the same model.

At 100% GPU it takes 42 sec. With no GPU and ~10% CPU it takes 2-3 seconds.

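If that 42-second figure comes from a single call, it likely includes one-off provider initialization (and, with TensorRT enabled, engine building), which can dwarf the inference itself; small models can also genuinely run faster on CPU. A hedged timing sketch that separates warm-up from steady-state (the `infer` callable here is a placeholder, not a real rembg call):

```python
import time

def timed_runs(infer, warmup=1, runs=5):
    """Call infer() warmup times untimed, then return per-run times in seconds."""
    for _ in range(warmup):
        infer()
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        infer()
        times.append(time.perf_counter() - start)
    return times

if __name__ == "__main__":
    # Placeholder workload; replace with e.g. a real rembg.remove(...) call.
    times = timed_runs(lambda: sum(range(100_000)))
    print(f"best of {len(times)} steady-state runs: {min(times):.4f}s")
```

If the steady-state runs are fast and only the first call is slow, the GPU path is working and the 42 s is setup cost, not per-image cost.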

jitendra-koodo avatar Jan 16 '25 12:01 jitendra-koodo

I finally got this working.

First, the problem was that several libraries were not being loaded because LD_LIBRARY_PATH did not include their directories. I fixed it with:

export LD_LIBRARY_PATH='/path/to/lib/python3.13/site-packages/nvidia/cublas/lib/':'/path/to/lib/python3.13/site-packages/onnxruntime/capi/':'/path/to/lib/python3.13/site-packages/nvidia/cudnn/lib/'
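Rather than hard-coding those three directories, the relevant lib dirs can be collected from site-packages. A sketch under the assumption that the NVIDIA wheels follow the usual `nvidia/<pkg>/lib` layout (the helper is my own, not part of rembg or onnxruntime):

```python
import glob
import os

def build_ld_library_path(site_packages):
    """Collect nvidia/*/lib and onnxruntime/capi dirs under a site-packages root."""
    patterns = [
        os.path.join(site_packages, "nvidia", "*", "lib"),
        os.path.join(site_packages, "onnxruntime", "capi"),
    ]
    dirs = []
    for pattern in patterns:
        dirs.extend(sorted(d for d in glob.glob(pattern) if os.path.isdir(d)))
    return os.pathsep.join(dirs)

if __name__ == "__main__":
    import sysconfig
    path = build_ld_library_path(sysconfig.get_paths()["purelib"])
    print(f'export LD_LIBRARY_PATH="{path}:$LD_LIBRARY_PATH"')
```

This avoids baking a Python version like `python3.13` into the path and picks up any additional nvidia wheels (cublas, cudnn, cufft, ...) automatically.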

Secondly, I explicitly specified CUDAExecutionProvider in rembg/sessions/base.py (I think the providers are not being passed through, CMIIW):

    def __init__(self, model_name: str, sess_opts: ort.SessionOptions, *args, **kwargs):
        """Initialize an instance of the BaseSession class."""
        self.model_name = model_name
        self.inner_session = ort.InferenceSession(
            str(self.__class__.download_models(*args, **kwargs)),
            sess_options=sess_opts,
            providers=["CUDAExecutionProvider"],
        )

ditesh avatar Mar 03 '25 10:03 ditesh

Nice. If I may make a suggestion: the models work great when I had it up and running on CUDA, and it's fast; I processed a few thousand images with it in a day on an RTX 3060 Ti. As for the TensorRT portion, my suggestion is to just turn it off and ensure CUDA comes first, simply because the models are not built for TensorRT. If I can find my old workings I'll share my Dockerfile.

akadata avatar Mar 04 '25 09:03 akadata