Environment installation problem

Open wyh196646 opened this issue 1 year ago • 10 comments

I want to install the environment on an 8*4090 machine, following the instructions in README.md, but the install fails with:

    Could not solve for environment specs
    The following packages are incompatible
    ├─ pytorch-cuda 11.7.0*  is requested and can be installed;
    ├─ pytorch 2.0.0*  is installable with the potential options
    │  ├─ pytorch 2.0.0 would require
    │  │  └─ pytorch-mutex 1.0 cpu, which can be installed;
    │  ├─ pytorch 2.0.0 would require
    │  │  └─ pytorch-mutex 1.0 cuda, which conflicts with any installable versions previously reported;
    │  └─ pytorch 2.0.0 would require
    │     ├─ pytorch-cuda >=11.8,<11.9 , which conflicts with any installable versions previously reported;
    │     └─ pytorch-mutex 1.0 cuda, which conflicts with any installable versions previously reported;
    ├─ torchvision 0.15.0*  is installable with the potential options
    │  ├─ torchvision 0.15.0 would require
    │  │  └─ pytorch-mutex 1.0 cpu, which can be installed;
    │  ├─ torchvision 0.15.0 would require
    │  │  ├─ pytorch-cuda 11.7.* , which can be installed;
    │  │  └─ pytorch-mutex 1.0 cuda, which conflicts with any installable versions previously reported;
    │  └─ torchvision 0.15.0 would require
    │     └─ pytorch-cuda 11.8.* , which conflicts with any installable versions previously reported;
    └─ xformers 0.0.18*  is not installable because there are no viable options
       ├─ xformers 0.0.18 would require
       │  └─ pytorch 1.12.1.*  but there are no viable options
       │     ├─ pytorch 1.12.1 conflicts with any installable versions previously reported;
       │     ├─ pytorch [1.12.1|1.13.1] would require
       │     │  └─ pytorch-mutex 1.0 cpu, which can be installed;
       │     └─ pytorch [1.12.1|1.13.1] would require
       │        └─ pytorch-mutex 1.0 cuda, which conflicts with any installable versions previously reported;
       └─ xformers 0.0.18 would require
          └─ pytorch 1.13.1.*  but there are no viable options
             ├─ pytorch [1.12.1|1.13.1], which cannot be installed (as previously explained);
             ├─ pytorch [1.12.1|1.13.1], which cannot be installed (as previously explained);
             ├─ pytorch 1.13.1 conflicts with any installable versions previously reported;
             └─ pytorch 1.13.1 would require
                ├─ pytorch-cuda >=11.6,<11.7 , which conflicts with any installable versions previously reported;
                └─ pytorch-mutex 1.0 cuda, which conflicts with any installable versions previously reported.

wyh196646 avatar Nov 09 '24 09:11 wyh196646

I got the same error. Looking for a fix.

Tan-B24 avatar Nov 12 '24 08:11 Tan-B24

I guess all the issues arise from the OpenMMLab (MMCV and MMSegmentation) dependencies. I appreciate the effort that the authors have put into this work. But frankly, I have no idea why so many papers from Facebook AI Research (and Microsoft as well) have an OpenMMLab dependency. They break after 6 months. It's a nightmare to install.

I handle a lot of installations because I am an applied deep learning engineer and spend a lot of time coding in deep learning. And I have been stuck for 9 hours now (yup), trying to go through numerous permutations of PyTorch, MMCV, and MMSegmentation.

The easiest way (which I think will work) is to install CUDA toolkit 11.7 and then run the conda install from the project. However, I believe hardly anyone will have global CUDA 11.7 in 2024. I have CUDA 12.x globally enabled and cannot risk messing up the system with multiple global CUDA versions.

sovit-123 avatar Nov 23 '24 10:11 sovit-123

Just want to mention that I also have this problem. It seems that there is a conda conflict with xformers=0.0.18 and pytorch=2.0.

Full conda error
    The following packages are incompatible
    ├─ python 3.9**  is installable with the potential options
    │  ├─ python [3.9.0|3.9.1|...|3.9.7], which can be installed;
    │  └─ python [3.9.0|3.9.1|...|3.9.9] would require
    │     └─ python_abi 3.9.* *_cp39, which can be installed;
    ├─ pytorch 2.0.0*  is installable with the potential options
    │  ├─ pytorch [1.12.1|1.13.1|2.0.0] would require
    │  │  └─ python >=3.10,<3.11.0a0 , which conflicts with any installable versions previously reported;
    │  ├─ pytorch [1.12.1|1.13.1|2.0.0] would require
    │  │  └─ python >=3.8,<3.9.0a0 , which conflicts with any installable versions previously reported;
    │  └─ pytorch 2.0.0, which can be installed;
    └─ xformers 0.0.18*  is not installable because there are no viable options
       ├─ xformers 0.0.18 would require
       │  └─ pytorch 1.13.1.*  but there are no viable options
       │     ├─ pytorch [1.12.1|1.13.1|2.0.0], which cannot be installed (as previously explained);
       │     ├─ pytorch [1.12.1|1.13.1] would require
       │     │  └─ python >=3.7,<3.8.0a0 , which conflicts with any installable versions previously reported;
       │     ├─ pytorch [1.12.1|1.13.1|2.0.0], which cannot be installed (as previously explained);
       │     ├─ pytorch [1.12.1|1.13.1] would require
       │     │  ├─ python >=3.10,<3.11.0a0 , which conflicts with any installable versions previously reported;
       │     │  └─ python_abi 3.10.* *_cp310, which conflicts with any installable versions previously reported;
       │     ├─ pytorch [1.12.1|1.13.1] would require
       │     │  ├─ python >=3.8,<3.9.0a0 , which conflicts with any installable versions previously reported;
       │     │  └─ python_abi 3.8.* *_cp38, which conflicts with any installable versions previously reported;
       │     ├─ pytorch 1.13.1 would require
       │     │  └─ python >=3.11,<3.12.0a0 , which conflicts with any installable versions previously reported;
       │     ├─ pytorch 1.13.1 conflicts with any installable versions previously reported;
       │     └─ pytorch 1.13.1 would require
       │        ├─ python >=3.11,<3.12.0a0 , which conflicts with any installable versions previously reported;
       │        └─ python_abi 3.11.* *_cp311, which conflicts with any installable versions previously reported;
       ├─ xformers 0.0.18 would require
       │  └─ pytorch 1.12.1.*  but there are no viable options
       │     ├─ pytorch [1.12.1|1.13.1|2.0.0], which cannot be installed (as previously explained);
       │     ├─ pytorch [1.12.1|1.13.1], which cannot be installed (as previously explained);
       │     ├─ pytorch [1.12.1|1.13.1|2.0.0], which cannot be installed (as previously explained);
       │     ├─ pytorch 1.12.1 conflicts with any installable versions previously reported;
       │     ├─ pytorch [1.12.1|1.13.1], which cannot be installed (as previously explained);
       │     ├─ pytorch 1.12.1 would require
       │     │  ├─ python >=3.7,<3.8.0a0 , which conflicts with any installable versions previously reported;
       │     │  └─ python_abi 3.7.* *_cp37m, which conflicts with any installable versions previously reported;
       │     └─ pytorch [1.12.1|1.13.1], which cannot be installed (as previously explained);
       └─ xformers 0.0.18 would require
          └─ python >=3.10,<3.11.0a0 , which conflicts with any installable versions previously reported.
critical libmamba Could not solve for environment specs

I can replicate this error with this minimal environment file:

name: dinov2
channels:
  - defaults
  - pytorch
  - nvidia
  - xformers
  - conda-forge
dependencies:
  - pytorch::pytorch=2.0.0
  - xformers::xformers=0.0.18

Conda error for the minimal environment file:
error    libmamba Could not solve for environment specs
    The following packages are incompatible
    ├─ pytorch 2.0.0*  is requested and can be installed;
    └─ xformers 0.0.18*  is not installable because there are no viable options
       ├─ xformers 0.0.18 would require
       │  └─ pytorch 1.13.1.* , which conflicts with any installable versions previously reported;
       └─ xformers 0.0.18 would require
          └─ pytorch 1.12.1.* , which conflicts with any installable versions previously reported.
critical libmamba Could not solve for environment specs

The main README.md says that it is only tested on these two versions, so, maybe these versions have changed underneath this repo?
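To make the conflict concrete, here is a minimal sketch of how a pin such as `1.13.1.*` compares against the requested version. It is not conda's or libmamba's solver; `matches_pin` is a hypothetical helper that only handles exact pins and the trailing-`.*` form seen in the error above. The pins themselves are copied from that error.

```python
def matches_pin(version: str, pin: str) -> bool:
    """Return True if `version` satisfies a conda-style pin like '1.13.1.*'.

    Hypothetical helper: handles only exact pins and trailing '.*' pins.
    """
    if pin.endswith(".*"):
        prefix = pin[:-2]
        return version == prefix or version.startswith(prefix + ".")
    return version == pin


# Pins the solver reported for xformers 0.0.18 (copied from the error output).
XFORMERS_0_0_18_PYTORCH_PINS = ["1.12.1.*", "1.13.1.*"]
requested_pytorch = "2.0.0"

compatible = any(
    matches_pin(requested_pytorch, pin) for pin in XFORMERS_0_0_18_PYTORCH_PINS
)
print(f"pytorch {requested_pytorch} satisfies an xformers 0.0.18 pin: {compatible}")
# prints: pytorch 2.0.0 satisfies an xformers 0.0.18 pin: False
```

Neither pin admits 2.0.0, which is why the two-line environment file is already unsolvable regardless of Python or CUDA versions.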

Multihuntr avatar Nov 29 '24 05:11 Multihuntr

I found a solution that seems to work. I initially had CUDA 12.4 installed globally on my system. This project requires CUDA 11.7; however, I was unwilling to uninstall the current version and a bit hesitant to install two CUDA versions.

Still, I went ahead and installed CUDA 11.7 using the local runfile installer. The image below shows the settings. After installation and adding the paths to bashrc, everything seems to work.

image

sovit-123 avatar Nov 29 '24 08:11 sovit-123

It's a conflict between xformers=0.0.18 and pytorch=2.0. I don't see how the CUDA version has anything to do with it. CUDA is backwards compatible, so having CUDA 12.x instead of CUDA 11.7 toolkit shouldn't make a difference if you want to install pytorch::pytorch-cuda=11.7.0.

But I tried it anyway, just to make sure. It didn't change anything for me.

# min.yaml
name: dinov2
channels:
  - nvidia
  - pytorch
  - xformers
  - conda-forge
dependencies:
  - pytorch::pytorch=2.0.0
  - pytorch::pytorch-cuda=11.7.*
  - xformers::xformers=0.0.18

# Dockerfile
FROM nvidia/cuda:11.7.1-devel-ubuntu22.04

RUN apt-get update && apt-get install -y curl gcc
RUN cd /root && curl -o ./install.sh -L micro.mamba.pm/install.sh && chmod +x ./install.sh && bash ./install.sh
ADD min.yaml /root

root@c5f7afb34614:/# cd root/ && micromamba create --file min.yaml
[... snip ...]
error    libmamba Could not solve for environment specs
    The following packages are incompatible
    ├─ pytorch =2.0.0 * is requested and can be installed;
    └─ xformers =0.0.18 * is not installable because there are no viable options
       ├─ xformers 0.0.18 would require
       │  └─ pytorch =1.13.1 *, which conflicts with any installable versions previously reported;
       └─ xformers 0.0.18 would require
          └─ pytorch =1.12.1 *, which conflicts with any installable versions previously reported.
critical libmamba Could not solve for environment specs

So, I'm going to try using xformers=0.0.19, which is allowed with pytorch=2.0. The environment installs with that change, but there might be some reason why that's not good. I'll report back if I run into any problems doing that.
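After changing a pin like this, it can help to confirm which versions actually ended up in the environment. The sketch below uses only the standard library; `installed_versions` is a hypothetical helper, and the distribution names (`torch`, `torchvision`, `xformers`) are assumptions about how the packages register their metadata.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumed distribution names; adjust if your environment registers them differently.
    report = installed_versions(["torch", "torchvision", "xformers"])
    for name, version in report.items():
        print(f"{name}: {version or 'not installed'}")
```

Run it inside the activated environment so that it inspects that environment's packages rather than the system Python's.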

Multihuntr avatar Dec 06 '24 02:12 Multihuntr

@Multihuntr I was actually referring to the full installation, which requires MMSegmentation for segmentation and depth estimation. I was able to fix that by installing CUDA 11.7. However, I am still unable to install xformers properly for this project. For me, it is not much of an issue, as I am loading the model independently and writing my own training and inference scripts.

sovit-123 avatar Dec 06 '24 02:12 sovit-123

I encountered the same issue on 5 RTX 4090 GPUs. I commented out the xformers part in the config file to set up the environment, and then installed the requirements using pip. Training and testing both run successfully, but only on a single GPU; multi-GPU training and testing are not working. I haven't figured out the reason yet.

chasm002 avatar Jan 13 '25 12:01 chasm002

@Multihuntr did you run into any problems?

abivaladez avatar Apr 03 '25 00:04 abivaladez

It ran to completion. For my particular use case, the resulting model didn't work well, but I think that's more to do with my data than the code here.

I also noticed that there is code to turn off xformers entirely, and to handle the case that xformers isn't installed at all. e.g. https://github.com/facebookresearch/dinov2/blob/e1277af2ba9496fbadf7aec6eba56e8d882d1e35/dinov2/layers/swiglu_ffn.py#L37-L51

So you could either use xformers=0.0.19, which will definitely run, or not install it at all, which will probably also run.

But I make no guarantees that this is what the original authors intended, nor do I know whether it will give precisely the same results as their paper.
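For anyone writing their own scripts around the model, the same optional-dependency idea can be sketched as follows. This is not the repository's code: `xformers_usable` is a hypothetical helper, and the `XFORMERS_DISABLED` opt-out variable name is an assumption made for this sketch.

```python
import importlib.util
import os


def xformers_usable(env=os.environ) -> bool:
    """Return True if xformers is importable and has not been disabled.

    XFORMERS_DISABLED is an assumed opt-out variable name for this sketch.
    """
    if env.get("XFORMERS_DISABLED") is not None:
        return False
    # find_spec returns None when the top-level package is not installed.
    return importlib.util.find_spec("xformers") is not None


if xformers_usable():
    print("xformers found: memory-efficient ops can be used")
else:
    print("xformers unavailable or disabled: falling back to plain PyTorch layers")
```

Checking with `find_spec` avoids importing the package just to learn whether it exists, so the check itself cannot fail on a broken xformers build.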

Multihuntr avatar Apr 03 '25 01:04 Multihuntr

@sovit-123 Just FYI, cuda-toolkit can be installed as part of the conda virtual environment, so it will not interfere with the default system one. But yeah, I do agree mmcv is a nightmare.
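A quick way to see which toolkit an activated environment would pick up is to check where `nvcc` resolves on `PATH`. The sketch below is a rough heuristic, not an official check: `toolkit_source` is a hypothetical helper, and it assumes conda exports `CONDA_PREFIX` for the active environment.

```python
import os
import shutil
from pathlib import Path


def toolkit_source(nvcc_path, conda_prefix):
    """Classify where nvcc comes from: 'missing', 'conda', or 'system'."""
    if nvcc_path is None:
        return "missing"
    if conda_prefix and Path(conda_prefix) in Path(nvcc_path).parents:
        # nvcc lives somewhere under the active environment's prefix.
        return "conda"
    return "system"


if __name__ == "__main__":
    source = toolkit_source(shutil.which("nvcc"), os.environ.get("CONDA_PREFIX"))
    print(f"nvcc on PATH: {source}")
```

If this reports `system` inside an environment that is supposed to carry its own toolkit, the environment's `bin` directory is probably not first on `PATH`.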

JianwenCao avatar Apr 15 '25 21:04 JianwenCao