Error: "No supported gpu backend found!"

Open Sizonnayak opened this issue 11 months ago • 1 comments

Hi team, I am getting this error when running this module in Docker. When I run it in a conda env, it works fine. Any solution?

$ boltz predict examples/prot_custom_msa.yaml --out_dir /run_output

Downloading the CCD dictionary to /root/.boltz/ccd.pkl. You may change the cache directory with the --cache flag.
Downloading the model weights to /root/.boltz/boltz1_conf.ckpt. You may change the cache directory with the --cache flag.
Checking input data.
Running predictions for 1 structure
Processing input data.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 27.57it/s]
Traceback (most recent call last):
  File "/usr/local/bin/boltz", line 8, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/boltz/main.py", line 669, in predict
    trainer = Trainer(
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/utilities/argparse.py", line 70, in insert_env_defaults
    return fn(self, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/trainer.py", line 395, in __init__
    self._accelerator_connector = _AcceleratorConnector(
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py", line 143, in __init__
    self._accelerator_flag = self._choose_gpu_accelerator_backend()
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py", line 353, in _choose_gpu_accelerator_backend
    raise MisconfigurationException("No supported gpu backend found!")
lightning_fabric.utilities.exceptions.MisconfigurationException: No supported gpu backend found!

May 06 '25 07:05 Sizonnayak

I ran into this issue.

FROM nvidia/cuda:12.8.0-runtime-ubuntu24.04

RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    --mount=type=cache,target=/root/.cache/pip \
    apt-get update && \
    apt-get install -y python3 python3-pip && \
    pip install boltz[cuda] -U --break-system-packages

With the Dockerfile above I got the same error. Apparently this happened because the Boltz dependencies (torch in particular) were being installed as CPU-only builds.

I was able to fix it by installing torch separately, before Boltz, like so:

FROM nvidia/cuda:12.8.0-runtime-ubuntu24.04

RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    --mount=type=cache,target=/root/.cache/pip \
    apt-get update && \
    apt-get install -y python3 python3-pip && \
    pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128 --break-system-packages && \
    pip install boltz[cuda] -U --break-system-packages
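
To check whether a given image ended up with a CPU-only torch, a small script along these lines can be run inside the container. This is only a sketch: the function name describe_torch_build is made up for illustration, and it relies on torch.version.cuda being None for CPU-only builds. Note that cuda_available can also come back False when the container was started without GPU access, even if the torch build itself is fine.

```python
import importlib.util


def describe_torch_build():
    """Return a small report on the installed torch build, if any.

    Hypothetical helper for illustration; not part of Boltz or PyTorch.
    """
    # Avoid an ImportError when torch is missing entirely.
    if importlib.util.find_spec("torch") is None:
        return {
            "installed": False,
            "version": None,
            "cuda_build": None,
            "cuda_available": False,
        }

    import torch

    return {
        "installed": True,
        # Version string of the installed torch package.
        "version": str(torch.__version__),
        # CUDA version torch was compiled against; None for CPU-only builds.
        "cuda_build": torch.version.cuda,
        # True only if torch can actually reach a GPU at runtime.
        "cuda_available": bool(torch.cuda.is_available()),
    }


if __name__ == "__main__":
    for key, value in describe_torch_build().items():
        print(f"{key}: {value}")
```

If cuda_build prints None, the torch wheel is CPU-only and installing torch from the CUDA index first, as in the Dockerfile above, is the thing to try. If cuda_build is set but cuda_available is False, the problem is more likely how the container is being launched than what is installed.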

Sep 29 '25 23:09 gheffern