
Issues with image loading and accelerate

Open aamir-gmail opened this issue 1 year ago • 5 comments

FYI, when starting the demo I get the following message:

torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: libtorch_cuda_cu.so: cannot open shared object file:

The other one comes from Hugging Face accelerate:

This model has some weights that should be kept in higher precision, you need to upgrade accelerate to properly deal with them (pip install --upgrade accelerate)
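For reference, here is a minimal check sketch (my own addition, not from the repo), assuming the usual cause of this warning: a torchvision wheel built against a different torch/CUDA build than the one that is actually installed.

# Sketch: print the installed torch/torchvision pair so a build mismatch is
# easy to spot. As far as I know, torch 2.0.0 pairs with torchvision 0.15.x
# and torch 1.12.1 pairs with torchvision 0.13.1; mixing a conda-installed
# torch with a pip-installed torchvision is a common way to hit the warning.
import torch
import torchvision

print("torch:      ", torch.__version__, "| CUDA:", torch.version.cuda)
print("torchvision:", torchvision.__version__)
print("torch path:      ", torch.__file__)
print("torchvision path:", torchvision.__file__)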

aamir-gmail avatar Apr 19 '23 07:04 aamir-gmail

Same for the first issue. Error message when running the demo: "/mnt/software/anaconda3/envs/minigpt4/lib/python3.9/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: libtorch_cuda_cu.so: cannot open shared object file: No such file or directory"

I have checked my cuda and pytorch installation. Looks fine. See outputs below.

>>> import torch
>>> print(torch.version.cuda)
11.7
>>> torch.cuda.is_available()
True
>>> torch.cuda.device_count()
1
>>> print(torch.__version__)
2.0.0+cu117

nvidia-smi
Wed Apr 26 15:02:29 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.43.04    Driver Version: 515.43.04    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A10            On | 00000000:00:08.0 Off |                    0 |
|  0%   32C    P8    21W / 150W |      0MiB / 23028MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

Need help with the error.

yemingx avatar Apr 26 '23 07:04 yemingx

The same issue for me. It seems that the pytorch=1.12.1 installed with conda gets uninstalled and replaced by torch==2.0.0 from pip. Therefore, environment.yaml should be updated.

Installing collected packages: webencodings, wcwidth, tokenizers, sentencepiece, Send2Trash, pytz, pydub, pure-eval, ptyprocess, pickleshare, pathtools, mpmath, mistune, lit, ipython-genutils, ffmpy, fastjsonschema, executing, cymem, cmake, cchardet, braceexpand, bitsandbytes, backcall, appdirs, antlr4-python3-runtime, zipp, websockets, websocket-client, webcolors, wasabi, uri-template, uc-micro-py, tzdata, traitlets, tqdm, tornado, toolz, tinycss2, threadpoolctl, tenacity, sympy, spacy-loggers, spacy-legacy, soupsieve, sniffio, smmap, smart-open, six, setproctitle, sentry-sdk, semantic-version, scipy, rfc3986-validator, regex, pyzmq, pyyaml, python-multipart, python-json-logger, pyrsistent, pyparsing, pygments, pydantic, psutil, protobuf, prompt-toolkit, prometheus-client, portalocker, platformdirs, pexpect, parso, pandocfilters, packaging, orjson, opencv-python, nvidia-nvtx-cu11, nvidia-nccl-cu11, nvidia-cusparse-cu11, nvidia-curand-cu11, nvidia-cufft-cu11, nvidia-cuda-runtime-cu11, nvidia-cuda-nvrtc-cu11, nvidia-cuda-cupti-cu11, nvidia-cublas-cu11, networkx, nest-asyncio, murmurhash, multidict, mdurl, markupsafe, llvmlite, langcodes, kiwisolver, jupyterlab-pygments, jsonpointer, joblib, h11, fsspec, frozenlist, fqdn, fonttools, filelock, entrypoints, defusedxml, decord, decorator, debugpy, cycler, contourpy, Click, chardet, catalogue, blis, attrs, async-timeout, aiofiles, yarl, webdataset, uvicorn, typer, terminado, srsly, scikit-learn, rfc3339-validator, python-dateutil, preshed, omegaconf, nvidia-cusolver-cu11, nvidia-cudnn-cu11, numba, nltk, matplotlib-inline, markdown-it-py, linkify-it-py, jupyter-core, jsonschema, jinja2, jedi, iopath, importlib-resources, importlib-metadata, huggingface-hub, gitdb, docker-pycreds, comm, bleach, beautifulsoup4, asttokens, argon2-cffi-bindings, anyio, aiosignal, transformers, starlette, stack-data, pynndescent, pathy, pandas, nbformat, mdit-py-plugins, matplotlib, jupyter-server-terminals, jupyter-client, httpcore, gradio-client, GitPython, confection, arrow, argon2-cffi, aiohttp, wandb, umap-learn, thinc, pycocotools, openai, nbclient, isoduration, ipython, httpx, fastapi, altair, spacy, pycocoevalcap, nbconvert, ipykernel, gradio, jupyter-events, jupyter-server, notebook-shim, nbclassic, notebook, triton, torch, accelerate, timm, sentence-transformers, peft
Attempting uninstall: torch
  Found existing installation: torch 1.12.1
  Uninstalling torch-1.12.1:
    Successfully uninstalled torch-1.12.1
Successfully installed Click-8.1.3 GitPython-3.1.31 Send2Trash-1.8.0 accelerate-0.16.0 aiofiles-23.1.0 aiohttp-3.8.4 aiosignal-1.3.1 altair-4.2.2 antlr4-python3-runtime-4.9.3 anyio-3.6.2 appdirs-1.4.4 argon2-cffi-21.3.0 argon2-cffi-bindings-21.2.0 arrow-1.2.3 asttokens-2.2.1 async-timeout-4.0.2 attrs-22.2.0 backcall-0.2.0 beautifulsoup4-4.12.2 bitsandbytes-0.37.0 bleach-6.0.0 blis-0.7.9 braceexpand-0.1.7 catalogue-2.0.8 cchardet-2.1.7 chardet-5.1.0 cmake-3.26.3 comm-0.1.3 confection-0.0.4 contourpy-1.0.7 cycler-0.11.0 cymem-2.0.7 debugpy-1.6.7 decorator-5.1.1 decord-0.6.0 defusedxml-0.7.1 docker-pycreds-0.4.0 entrypoints-0.4 executing-1.2.0 fastapi-0.95.1 fastjsonschema-2.16.3 ffmpy-0.3.0 filelock-3.9.0 fonttools-4.38.0 fqdn-1.5.1 frozenlist-1.3.3 fsspec-2023.4.0 gitdb-4.0.10 gradio-3.24.1 gradio-client-0.0.8 h11-0.14.0 httpcore-0.17.0 httpx-0.24.0 huggingface-hub-0.13.4 importlib-metadata-6.6.0 importlib-resources-5.12.0 iopath-0.1.10 ipykernel-6.22.0 ipython-8.12.0 ipython-genutils-0.2.0 isoduration-20.11.0 jedi-0.18.2 jinja2-3.1.2 joblib-1.2.0 jsonpointer-2.3 jsonschema-4.17.3 jupyter-client-8.2.0 jupyter-core-5.3.0 jupyter-events-0.6.3 jupyter-server-2.5.0 jupyter-server-terminals-0.4.4 jupyterlab-pygments-0.2.2 kiwisolver-1.4.4 langcodes-3.3.0 linkify-it-py-2.0.0 lit-16.0.2 llvmlite-0.39.1 markdown-it-py-2.2.0 markupsafe-2.1.2 matplotlib-3.7.0 matplotlib-inline-0.1.6 mdit-py-plugins-0.3.3 mdurl-0.1.2 mistune-2.0.5 mpmath-1.3.0 multidict-6.0.4 murmurhash-1.0.9 nbclassic-0.5.5 nbclient-0.7.4 nbconvert-7.3.1 nbformat-5.8.0 nest-asyncio-1.5.6 networkx-3.1 nltk-3.8.1 notebook-6.5.4 notebook-shim-0.2.3 numba-0.56.4 nvidia-cublas-cu11-11.10.3.66 nvidia-cuda-cupti-cu11-11.7.101 nvidia-cuda-nvrtc-cu11-11.7.99 nvidia-cuda-runtime-cu11-11.7.99 nvidia-cudnn-cu11-8.5.0.96 nvidia-cufft-cu11-10.9.0.58 nvidia-curand-cu11-10.2.10.91 nvidia-cusolver-cu11-11.4.0.1 nvidia-cusparse-cu11-11.7.4.91 nvidia-nccl-cu11-2.14.3 nvidia-nvtx-cu11-11.7.91 omegaconf-2.3.0 openai-0.27.0 opencv-python-4.7.0.72 orjson-3.8.10 packaging-23.0 pandas-2.0.1 pandocfilters-1.5.0 parso-0.8.3 pathtools-0.1.2 pathy-0.10.1 peft-0.2.0 pexpect-4.8.0 pickleshare-0.7.5 platformdirs-3.3.0 portalocker-2.7.0 preshed-3.0.8 prometheus-client-0.16.0 prompt-toolkit-3.0.38 protobuf-4.22.3 psutil-5.9.4 ptyprocess-0.7.0 pure-eval-0.2.2 pycocoevalcap-1.2 pycocotools-2.0.6 pydantic-1.10.7 pydub-0.25.1 pygments-2.15.1 pynndescent-0.5.10 pyparsing-3.0.9 pyrsistent-0.19.3 python-dateutil-2.8.2 python-json-logger-2.0.7 python-multipart-0.0.6 pytz-2023.3 pyyaml-6.0 pyzmq-25.0.2 regex-2022.10.31 rfc3339-validator-0.1.4 rfc3986-validator-0.1.1 scikit-learn-1.2.2 scipy-1.10.1 semantic-version-2.10.0 sentence-transformers-2.2.2 sentencepiece-0.1.98 sentry-sdk-1.21.0 setproctitle-1.3.2 six-1.16.0 smart-open-6.3.0 smmap-5.0.0 sniffio-1.3.0 soupsieve-2.4.1 spacy-3.5.1 spacy-legacy-3.0.12 spacy-loggers-1.0.4 srsly-2.4.6 stack-data-0.6.2 starlette-0.26.1 sympy-1.11.1 tenacity-8.2.2 terminado-0.17.1 thinc-8.1.9 threadpoolctl-3.1.0 timm-0.6.13 tinycss2-1.2.1 tokenizers-0.13.2 toolz-0.12.0 torch-2.0.0 tornado-6.3.1 tqdm-4.64.1 traitlets-5.9.0 transformers-4.28.0 triton-2.0.0 typer-0.7.0 tzdata-2023.3 uc-micro-py-1.0.1 umap-learn-0.5.3 uri-template-1.2.0 uvicorn-0.21.1 wandb-0.15.0 wasabi-1.1.1 wcwidth-0.2.6 webcolors-1.13 webdataset-0.2.48 webencodings-0.5.1 websocket-client-1.5.1 websockets-11.0.2 yarl-1.8.2 zipp-3.14.0
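A quick way to confirm which install actually owns torch inside the env (my own sketch, not from the thread):

# Sketch: show the runtime torch version, the site-packages path that owns it,
# and the version recorded in the package metadata. After the log above, this
# should already report the pip-installed 2.0.0 instead of conda's 1.12.1.
from importlib import metadata

import torch

print(torch.__version__)          # runtime version
print(torch.__file__)             # path reveals which site-packages owns it
print(metadata.version("torch"))  # version from the installed package metadata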

Then I tried to pin the torch version in the pip package list in environment.yml as follows:

...
  - pip:
    - --extra-index-url https://download.pytorch.org/whl/cu113
    - torch==1.12.0+cu113
...

But I got the following error.

INFO: pip is looking at multiple versions of torch to determine which version is compatible with other requirements. This could take a while.

The conflict is caused by:
    The user requested torch==1.12.0+cu113
    accelerate 0.16.0 depends on torch>=1.4.0
    timm 0.6.13 depends on torch>=1.7
    peft 0.2.0 depends on torch>=1.13.0
    The user requested torch==1.12.0+cu113
    accelerate 0.16.0 depends on torch>=1.4.0
    timm 0.6.13 depends on torch>=1.7
    peft 0.1.0 depends on torch>=1.13.0
    The user requested torch==1.12.0+cu113
    accelerate 0.16.0 depends on torch>=1.4.0
    timm 0.6.13 depends on torch>=1.7
    peft 0.0.2 depends on torch>=1.13.0
    The user requested torch==1.12.0+cu113
    accelerate 0.16.0 depends on torch>=1.4.0
    timm 0.6.13 depends on torch>=1.7
    peft 0.0.1 depends on torch>=1.13.0

I have no idea how to solve this, because every peft release that pip tried requires torch>=1.13.0, and there is no torch>=1.13.0 build published for cu113.

getpa avatar Apr 26 '23 10:04 getpa

My service ran into this issue as well. I solved it by:

cd MiniGPT-4
conda env update

You can check the effective library versions with conda list. If there are still library conflicts, try:

conda deactivate
source ~/.bashrc
conda activate minigpt4

thiner avatar Apr 26 '23 11:04 thiner

I just solved the following error:

torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: libtorch_cuda_cu.so: cannot open shared object file:

To fix it, you can rewrite environment.yml as follows.

name: minigpt4
channels:
  - pytorch
  - defaults
  - anaconda
  - nvidia
dependencies:
  - python=3.9
  - cudatoolkit=11.8.0
  - pip
  - pytorch
  - pytorch-cuda=11.8
  - torchaudio
  - torchvision
  - pip:
    - accelerate==0.16.0
    - aiohttp==3.8.4
    - aiosignal==1.3.1
    - async-timeout==4.0.2
    - attrs==22.2.0
    - bitsandbytes==0.38.0
    - cchardet==2.1.7
    - chardet==5.1.0
    - contourpy==1.0.7
    - cycler==0.11.0
    - filelock==3.9.0
    - fonttools==4.38.0
    - frozenlist==1.3.3
    - huggingface-hub==0.13.4
    - importlib-resources==5.12.0
    - kiwisolver==1.4.4
    - matplotlib==3.7.0
    - multidict==6.0.4
    - openai==0.27.0
    - packaging==23.0
    - psutil==5.9.4
    - pycocotools==2.0.6
    - pyparsing==3.0.9
    - python-dateutil==2.8.2
    - pyyaml==6.0
    - regex==2022.10.31
    - tokenizers==0.13.2
    - tqdm==4.64.1
    - transformers==4.28.0
    - timm==0.6.13
    - spacy==3.5.1
    - webdataset==0.2.48
    - scikit-learn==1.2.2
    - scipy==1.10.1
    - yarl==1.8.2
    - zipp==3.14.0
    - omegaconf==2.3.0
    - opencv-python==4.7.0.72
    - iopath==0.1.10
    - decord==0.6.0
    - tenacity==8.2.2
    - peft
    - pycocoevalcap
    - sentence-transformers
    - umap-learn
    - notebook
    - gradio==3.24.1
    - gradio-client==0.0.8
    - wandb

But unfortunately, I'm still unable to interact with MiniGPT-4 due to the following error when I upload an image.

Traceback (most recent call last):
  File "/work/s183313/.pyenv/versions/mambaforge/envs/minigpt4/lib/python3.9/site-packages/gradio/routes.py", line 393, in run_predict
    output = await app.get_blocks().process_api(
  File "/work/s183313/.pyenv/versions/mambaforge/envs/minigpt4/lib/python3.9/site-packages/gradio/blocks.py", line 1108, in process_api
    result = await self.call_function(
  File "/work/s183313/.pyenv/versions/mambaforge/envs/minigpt4/lib/python3.9/site-packages/gradio/blocks.py", line 915, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/work/s183313/.pyenv/versions/mambaforge/envs/minigpt4/lib/python3.9/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/work/s183313/.pyenv/versions/mambaforge/envs/minigpt4/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/work/s183313/.pyenv/versions/mambaforge/envs/minigpt4/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/work/s183313/MiniGPT-4/demo.py", line 83, in upload_img
    llm_message = chat.upload_img(gr_img, chat_state, img_list)
  File "/work/s183313/MiniGPT-4/minigpt4/conversation/conversation.py", line 185, in upload_img
    image_emb, _ = self.model.encode_img(image)
  File "/work/s183313/MiniGPT-4/minigpt4/models/mini_gpt4.py", line 139, in encode_img
    query_output = self.Qformer.bert(
  File "/work/s183313/.pyenv/versions/mambaforge/envs/minigpt4/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/work/s183313/MiniGPT-4/minigpt4/models/Qformer.py", line 937, in forward
    encoder_outputs = self.encoder(
  File "/work/s183313/.pyenv/versions/mambaforge/envs/minigpt4/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/work/s183313/MiniGPT-4/minigpt4/models/Qformer.py", line 550, in forward
    layer_outputs = layer_module(
  File "/work/s183313/.pyenv/versions/mambaforge/envs/minigpt4/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/work/s183313/MiniGPT-4/minigpt4/models/Qformer.py", line 417, in forward
    self_attention_outputs = self.attention(
  File "/work/s183313/.pyenv/versions/mambaforge/envs/minigpt4/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/work/s183313/MiniGPT-4/minigpt4/models/Qformer.py", line 332, in forward
    self_outputs = self.self(
  File "/work/s183313/.pyenv/versions/mambaforge/envs/minigpt4/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/work/s183313/MiniGPT-4/minigpt4/models/Qformer.py", line 195, in forward
    key_layer = self.transpose_for_scores(self.key(hidden_states))
  File "/work/s183313/.pyenv/versions/mambaforge/envs/minigpt4/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/work/s183313/.pyenv/versions/mambaforge/envs/minigpt4/lib/python3.9/site-packages/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling `cublasCreate(handle)`

Does anyone have an idea?
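For what it's worth, here is a minimal diagnostic sketch (my own addition, assuming that CUBLAS_STATUS_NOT_INITIALIZED from cublasCreate usually points to the GPU running out of memory or to a broken CUDA setup, rather than to a MiniGPT-4 bug):

# Sketch: force cuBLAS initialisation with a tiny matmul in the same env.
# If even this fails, the CUDA/cuBLAS installation itself is broken; if it
# succeeds, the error above more likely means the GPU ran out of memory
# after the model weights were loaded.
import torch

print(torch.__version__, torch.version.cuda, torch.cuda.get_device_name(0))
a = torch.randn(16, 16, device="cuda")
b = torch.randn(16, 16, device="cuda")
print((a @ b).sum().item())  # cuBLAS handle is created on the first matmul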

getpa avatar Apr 26 '23 13:04 getpa

I think you should not modify the requirement.txt file, because the libraries depend on each other. In my opinion the problem is caused by pytorch: in my case the pytorch version was upgraded to 2.0.0, which is not the version specified in the original requirement.txt file. I guess it was upgraded by another library. You need to update the conda env "minigpt4" to fix the pytorch version by running conda env update.

thiner avatar Apr 26 '23 14:04 thiner

Did anyone solve the issue?

sushilkhadkaanon avatar Sep 18 '23 19:09 sushilkhadkaanon

(Quoting yemingx's comment above about the torchvision UserWarning and the CUDA/PyTorch checks.)

I ran into the same issue. Did you solve it?

sushilkhadkaanon avatar Sep 18 '23 19:09 sushilkhadkaanon

I solved this issue. The image warning is just a warning; you can ignore it. It won't affect inference.

  1. pip install --upgrade accelerate
  2. In my case the remaining failure was due to GPU memory capacity; I was able to run inference with the 7B model (see the memory-check sketch below).
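A quick way to judge this up front (a small sketch of my own, not from the thread): print the free GPU memory from the same environment before loading a checkpoint; if the larger model does not fit, the failure tends to surface as CUDA/cuBLAS errors like the one above.

# Sketch: report free vs. total GPU memory so you can judge whether the
# chosen checkpoint (7B vs. 13B) plausibly fits before launching demo.py.
import torch

free_bytes, total_bytes = torch.cuda.mem_get_info()
print(f"free : {free_bytes / 2**30:.1f} GiB")
print(f"total: {total_bytes / 2**30:.1f} GiB")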

sushilkhadkaanon avatar Sep 19 '23 11:09 sushilkhadkaanon