dinov2 clean install of conda env create -f conda-extras.yaml on Ubuntu fails to install cuml-cu11, any ideas?

It fails with this error:

Preparing transaction: done
Verifying transaction: done
Executing transaction: done
Installing pip dependencies: - Ran pip subprocess with arguments:
['/home/smartreplayuser/miniconda3/envs/dinov2-extras/bin/python', '-m', 'pip', 'install', '-U', '-r', '/home/smartreplayuser/git/Facebook/dinov2/condaenv.slco21xq.requirements.txt', '--exists-action=b']
Pip subprocess output:
Looking in indexes: https://pypi.org/simple, https://pypi.nvidia.com
Collecting git+https://github.com/facebookincubator/submitit (from -r /home/smartreplayuser/git/Facebook/dinov2/condaenv.slco21xq.requirements.txt (line 1))
  Cloning https://github.com/facebookincubator/submitit to /tmp/pip-req-build-u5ykxk6q
  Resolved https://github.com/facebookincubator/submitit to commit 07f21fa1234e34151874c00d80c345e215af4967
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
  Preparing metadata (pyproject.toml): started
Preparing metadata (pyproject.toml): finished with status 'done'
Collecting cuml-cu11 (from -r /home/smartreplayuser/git/Facebook/dinov2/condaenv.slco21xq.requirements.txt (line 3))
  Downloading cuml-cu11-23.12.0.tar.gz (6.8 kB)
  Preparing metadata (setup.py): started
Preparing metadata (setup.py): finished with status 'error'

Pip subprocess error:
  Running command git clone --filter=blob:none --quiet https://github.com/facebookincubator/submitit /tmp/pip-req-build-u5ykxk6q
  error: subprocess-exited-with-error

  × python setup.py egg_info did not run successfully.
  │ exit code: 1
╰─> [16 lines of output]
Traceback (most recent call last):
File "<string>", line 2, in <module>
File "<pip-setuptools-caller>", line 34, in <module>
File "/tmp/pip-install-j61__ee5/cuml-cu11_45092e6783154e1f9c70d31f80db8581/setup.py", line 137, in <module>
raise RuntimeError(open("ERROR.txt", "r").read())
RuntimeError:
###########################################################################################
The package you are trying to install is only a placeholder project on PyPI.org repository.
This package is hosted on NVIDIA Python Package Index.

This package can be installed as:

$ pip install --no-cache-dir --extra-index-url https://pypi.nvidia.com cuml-cu11

###########################################################################################

  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
failed

CondaEnvException: Pip failed

And the recommended fix, pip install --no-cache-dir --extra-index-url https://pypi.nvidia.com cuml-cu11, fails with the same error.

Dec 09 '23 03:12 lovettchris

Same error. It used to work well but has been failing since today.

Dec 09 '23 23:12 clemsgrs

Same error here. FWIW: Ubuntu 22.04 with CUDA 12.2 and NVIDIA driver 535.129.03. Replacing "cuml-cu11" with "cuml-cu12" did not work.

Dec 10 '23 10:12 Florian2Richter

You may try to install an older version like this: $ pip install --no-cache-dir --extra-index-url https://pypi.nvidia.com cuml-cu11==23.10.0

Dec 10 '23 11:12 kuma94506

that works, thanks.

Dec 11 '23 16:12 lovettchris

Interesting. For me, pip install --no-cache-dir --extra-index-url https://pypi.nvidia.com/ cuml-cu11==23.10.0 didn't work either (same error; I tried older versions too).

Dec 11 '23 16:12 clemsgrs

Seems to be fixed by now: the usual pip install -r requirements.txt worked fine.

Dec 12 '23 16:12 Florian2Richter

Interesting, Florian, can you post the version of CUDA, pytorch and Python that you are using?

Dec 12 '23 18:12 lovettchris

Sure. My virtual environment (venv, not conda) has Python 3.10.12, PyTorch 2.0.0+cu117, and NVIDIA driver 535.129.03 with CUDA 12.2.
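
For anyone reporting their setup in the same way, here is a minimal sketch for collecting these versions from inside the environment. It uses only the standard library. The default distribution names are assumptions based on the packages mentioned in this thread, and the helper names are hypothetical. It does not report the CUDA or driver version, which have to come from a tool such as nvidia-smi.

```python
import platform
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def environment_report(dist_names=("torch", "cuml-cu11", "mmcv-full", "mmsegmentation")):
    """Collect the Python version plus the versions of the given distributions."""
    # The default names are assumptions taken from this thread, not a fixed list.
    report = {"python": platform.python_version()}
    for name in dist_names:
        report[name] = installed_version(name)
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value if value is not None else 'not installed'}")
```

Running it in both the venv and the conda environment makes it easier to compare setups that work with ones that do not.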

Dec 13 '23 16:12 Florian2Richter

The issue seems fixed for me. FYI, I'm using the conda installation.

Dec 13 '23 18:12 clemsgrs

@Florian2Richter, interestingly, conda.yaml and conda-extras.yaml specify Python 3.9.

To get the dinov2 segmentation head working on Ubuntu, I had to build mmcv from source using MMCV_WITH_OPS=1 pip install -e . (this required a newer version of GCC that supports C++17). After that, the segmentation head worked on CUDA, and I measured about 4 seconds per inference on a Tesla T4 GPU using the small backbone dinov2_vits14. I also had to install the ftfy and regex pip packages. For me, "pip install mmcv-full==1.5.0" results in the error:

ModuleNotFoundError: No module named 'mmcv._ext'
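
One way to check whether the compiled ops are present, without triggering the traceback, is to ask Python whether the submodule can be found. This is a minimal sketch using only the standard library. has_module is a hypothetical helper name, and the module names checked at the bottom are the ones that appear in error messages in this thread.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if `name` can be located, False if it (or its parent) is missing."""
    try:
        # For a dotted name, find_spec imports the parent package first,
        # so a missing parent raises ModuleNotFoundError (an ImportError).
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


if __name__ == "__main__":
    for module_name in ("mmcv", "mmcv._ext", "mmcv.ops"):
        status = "found" if has_module(module_name) else "missing"
        print(f"{module_name}: {status}")
```

If mmcv is found but mmcv._ext is missing, that is consistent with an install that did not build the compiled extension.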

Dec 13 '23 18:12 lovettchris

I can confirm that installing mmcv from source fixed the issue when running the segmentation scripts. You need to clone the specific version (not just the main branch) with the following command: git clone https://github.com/open-mmlab/mmcv.git --branch v1.5.3 --single-branch, and install nvcc v11.7 if needed before building mmcv (conda install -c conda-forge cudatoolkit-dev=11.7). With the regular pip install I get the same error as @lovettchris.

@lovettchris, have you tried to reproduce the segmentation results? I am a little bit lost about the patch size: all dinov2 backbones have a patch size of 14, but the segmentation evaluation seems to assume 16. The description says "It is used to produce a low-resolution logit map (eg 32x32 for a model with patch size 16)", and the input image size is 512, which is divisible by 16 but not by 14. After modifying the config for a patch size of 14, I can partly reproduce the results for ADE20k, but not for Pascal VOC.

EDIT (22.02.24): I've managed to reproduce the results for both ADE20k and Pascal VOC. Don't forget to override the init_weights() method for your backbone: it is not enough to load the checkpoint weights in the constructor. Otherwise, during segmentation training, the weights can be overwritten by the default weight initialization (source).
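
The divisibility point can be checked with a few lines of arithmetic. This is only a sketch of the numbers quoted above. patch_grid and nearest_multiples are hypothetical helpers, not part of dinov2.

```python
def patch_grid(image_size: int, patch_size: int):
    """Return (patches_per_side, leftover_pixels) for a square image."""
    return divmod(image_size, patch_size)


def nearest_multiples(image_size: int, patch_size: int):
    """Return the closest image sizes at or below and at or above `image_size`
    that are exact multiples of `patch_size`."""
    lower = (image_size // patch_size) * patch_size
    upper = lower if lower == image_size else lower + patch_size
    return lower, upper


print(patch_grid(512, 16))         # (32, 0): a clean 32x32 grid
print(patch_grid(512, 14))         # (36, 8): 8 pixels left over per side
print(nearest_multiples(512, 14))  # (504, 518)
```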

Your backbone (dinov2/eval/segmentation/models/backbones/vision_transformer.py) should look similar to this.

Feel free to ping me if you have some issues.
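
As a toy illustration of the init_weights() point: the sketch below uses plain Python stand-ins, not the mmcv or dinov2 classes, and every class and function name in it is hypothetical. It assumes only what the post describes, namely that the training setup calls init_weights() after the backbone has been constructed.

```python
import random


class ToyBaseBackbone:
    """Stand-in for a base class whose init_weights() applies default initialization."""

    def __init__(self):
        self.weights = [0.0, 0.0, 0.0]

    def init_weights(self):
        # Default initialization: overwrites whatever is currently stored.
        self.weights = [random.gauss(0.0, 0.02) for _ in self.weights]


class LoadedInConstructorOnly(ToyBaseBackbone):
    """Loads the checkpoint once, in the constructor, and nowhere else."""

    def __init__(self, checkpoint):
        super().__init__()
        self.weights = list(checkpoint)


class LoadedInInitWeights(ToyBaseBackbone):
    """Overrides init_weights() so the checkpoint is re-applied instead."""

    def __init__(self, checkpoint):
        super().__init__()
        self.checkpoint = list(checkpoint)
        self.weights = list(checkpoint)

    def init_weights(self):
        self.weights = list(self.checkpoint)


def training_setup(backbone):
    """Mimics a trainer that calls init_weights() after construction."""
    backbone.init_weights()
    return backbone.weights


CHECKPOINT = [1.0, 2.0, 3.0]
lost = training_setup(LoadedInConstructorOnly(CHECKPOINT))
kept = training_setup(LoadedInInitWeights(CHECKPOINT))
print("constructor only:", lost)      # random values, checkpoint is gone
print("init_weights override:", kept)  # [1.0, 2.0, 3.0]
```

In the first case the checkpoint values are silently replaced by the default initialization. In the second they survive because the override reloads them.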

Feb 15 '24 14:02 bruce-willis

I had several issues setting up the environment for segmentation properly.

I did the following:

I have CUDA 11.7 and Python 3.9.12
I edited the requirements.txt to specify the version of cuml-cu11: cuml-cu11==23.10.0 as explained by @kuma94506
I installed the requirements.txt: pip3 install -r requirements.txt. I didn't install the requirements-extra.txt; instead, I followed the next steps.
As explained by @lovettchris and @bruce-willis, I compiled mmcv-full from scratch:

git clone https://github.com/open-mmlab/mmcv.git --branch v1.5.0 --single-branch
cd mmcv/
MMCV_WITH_OPS=1 pip install -e .

I installed mmsegmentation using: pip3 install mmsegmentation==0.27.0
I installed opencv: pip3 install opencv-python
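
The requirements.txt edit from the steps above (pinning cuml-cu11 to 23.10.0) can also be scripted. This is a minimal sketch. pin_requirement is a hypothetical helper, and the example lines are illustrative only, not the actual contents of the dinov2 requirements.txt.

```python
import re


def pin_requirement(lines, package, version):
    """Return a copy of `lines` with `package` pinned to exactly `version`.

    Only lines consisting of the bare package name, optionally followed by
    an existing version specifier, are rewritten. Everything else is kept.
    """
    pattern = re.compile(rf"^\s*{re.escape(package)}\s*([=<>!~].*)?$")
    pinned = []
    for line in lines:
        if pattern.match(line):
            pinned.append(f"{package}=={version}")
        else:
            pinned.append(line)
    return pinned


# Illustrative input only (assumed, not copied from the repository).
example = ["torch==2.0.0", "cuml-cu11", "submitit"]
print(pin_requirement(example, "cuml-cu11", "23.10.0"))
# ['torch==2.0.0', 'cuml-cu11==23.10.0', 'submitit']
```

To apply it to a real file, read the file with splitlines(), pass the result through pin_requirement, and write the lines back joined with newlines.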

I still get the error: ModuleNotFoundError: No module named 'mmcv.ops'. @lovettchris @bruce-willis, how did you fix this? Thanks

Jun 28 '24 20:06 mfoglio
