
failed running python setup.py build develop for mega.pytorch

Open LilyDaytoy opened this issue 1 year ago • 7 comments

I followed INSTALL.md, but the command python setup.py build develop fails.

My nvcc --version reports CUDA 10.1. I tried each of the following:

conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.1 -c pytorch
conda install pytorch==1.5.0 torchvision==0.6.0 cudatoolkit=10.1 -c pytorch
conda install pytorch==1.7.0 torchvision==0.8.0 torchaudio==0.7.0 cudatoolkit=10.1 -c pytorch

None of them worked.

LilyDaytoy avatar Aug 17 '22 06:08 LilyDaytoy

raise RuntimeError(message)
RuntimeError: Error compiling objects for extension

LilyDaytoy avatar Aug 17 '22 06:08 LilyDaytoy

This is my error message:

python setup.py build develop
running build
running build_py
running build_ext
building 'mega_core._C' extension
/mnt/lustre/share/gcc/gcc-5.3.0/bin -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA -I/mnt/lustre/wxpeng/MSG/mega.pytorch/mega_core/csrc -I/mnt/lustre/wxpeng/anaconda3/envs/mega/lib/python3.7/site-packages/torch/include -I/mnt/lustre/wxpeng/anaconda3/envs/mega/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/mnt/lustre/wxpeng/anaconda3/envs/mega/lib/python3.7/site-packages/torch/include/TH -I/mnt/lustre/wxpeng/anaconda3/envs/mega/lib/python3.7/site-packages/torch/include/THC -I/mnt/lustre/share/cuda-10.0/include -I/mnt/lustre/wxpeng/anaconda3/envs/mega/include/python3.7m -c /mnt/lustre/wxpeng/MSG/mega.pytorch/mega_core/csrc/cpu/ROIAlign_cpu.cpp -o build/temp.linux-x86_64-3.7/mnt/lustre/wxpeng/MSG/mega.pytorch/mega_core/csrc/cpu/ROIAlign_cpu.o -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
error: command '/mnt/lustre/share/gcc/gcc-5.3.0/bin' failed: Permission denied

Could you help me check what is going on? Thanks a lot!

LilyDaytoy avatar Aug 17 '22 07:08 LilyDaytoy

The current codebase only works with PyTorch 1.3.0 (or lower), as mentioned in INSTALL.md, so you may want to try an older version of PyTorch.

Scalsol avatar Aug 17 '22 07:08 Scalsol

Hi! I tried using exactly conda install pytorch=1.3.0 torchvision cudatoolkit=10.0 -c pytorch, and I also changed my nvcc version to 10.0, but it still failed like this:

python setup.py build develop
running build
running build_py
running build_ext
building 'mega_core._C' extension
/mnt/lustre/share/gcc/gcc-5.3.0/bin -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA -I/mnt/lustre/wxpeng/MSG/mega.pytorch/mega_core/csrc -I/mnt/lustre/wxpeng/anaconda3/envs/mega/lib/python3.7/site-packages/torch/include -I/mnt/lustre/wxpeng/anaconda3/envs/mega/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/mnt/lustre/wxpeng/anaconda3/envs/mega/lib/python3.7/site-packages/torch/include/TH -I/mnt/lustre/wxpeng/anaconda3/envs/mega/lib/python3.7/site-packages/torch/include/THC -I/mnt/lustre/share/cuda-10.0/include -I/mnt/lustre/wxpeng/anaconda3/envs/mega/include/python3.7m -c /mnt/lustre/wxpeng/MSG/mega.pytorch/mega_core/csrc/cpu/ROIAlign_cpu.cpp -o build/temp.linux-x86_64-3.7/mnt/lustre/wxpeng/MSG/mega.pytorch/mega_core/csrc/cpu/ROIAlign_cpu.o -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
error: command '/mnt/lustre/share/gcc/gcc-5.3.0/bin' failed: Permission denied

LilyDaytoy avatar Aug 17 '22 07:08 LilyDaytoy

My versions: nvcc 10.0, gcc 5.3.0, pytorch 1.3.0

Also for

cd cocoapi/PythonAPI
python setup.py build_ext install

I also encountered this issue:

error: command '/mnt/lustre/share/gcc/gcc-5.3.0/bin' failed: Permission denied

so I used

conda install -c conda-forge pycocotools

Is that OK?

LilyDaytoy avatar Aug 17 '22 07:08 LilyDaytoy

It seems to be a permission issue with your gcc directory. Maybe try updating the folder permissions with chmod.

Scalsol avatar Aug 17 '22 08:08 Scalsol
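
A possible way to narrow this down: the failing command in the log is /mnt/lustre/share/gcc/gcc-5.3.0/bin, which ends in bin and so may be a directory rather than the gcc executable. Trying to execute a directory also produces "Permission denied". The sketch below is only a diagnostic under that assumption; diagnose_compiler is a hypothetical helper, not part of mega.pytorch, and it assumes the build picks up its compiler from the CC environment variable.

```python
import os
import shutil


def diagnose_compiler(cc):
    """Describe what a compiler setting (hypothetical helper) points to."""
    if os.path.isdir(cc):
        # Executing a directory fails with "Permission denied".
        return "directory"
    resolved = shutil.which(cc)
    if resolved is None:
        # Neither an executable path nor a command found on PATH.
        return "not found"
    return "executable: " + resolved


if __name__ == "__main__":
    # Assumption: the build reads CC; fall back to plain "gcc" if unset.
    print(diagnose_compiler(os.environ.get("CC", "gcc")))
```

If this reports "directory", pointing CC at the compiler binary itself (rather than changing folder permissions) is probably the thing to try.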

Ohh, thanks a lot! I found the error was actually caused by my gcc directory. I added CC=gcc before python setup.py build develop, and the problem was solved.

But I encountered another problem. When running inference with the model, I get the error AttributeError: module 'torch.cuda' has no attribute 'amp' (apex/apex/transformer/amp/grad_scaler.py, line 21). I searched online, and people say amp is only available from PyTorch 1.6 onward, but the repo only supports PyTorch 1.3 and lower. Is there any way to solve this problem? Thanks for your patience :D

LilyDaytoy avatar Aug 17 '22 12:08 LilyDaytoy
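
The version conflict described in the last comment can be made explicit with a small check. This is a minimal sketch, not code from the repo: parse_version and check_torch_version are hypothetical names, and the two thresholds are taken from the thread (the repo works with PyTorch 1.3.0 or lower; torch.cuda.amp arrived in PyTorch 1.6).

```python
def parse_version(text):
    """Turn '1.3.0' or '1.7.0+cu101' into a tuple of ints such as (1, 3, 0)."""
    core = text.split("+")[0]  # drop a local build suffix like '+cu101'
    parts = []
    for piece in core.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if digits == "":
            break
        parts.append(int(digits))
    return tuple(parts)


def check_torch_version(text):
    """Report which of the two requirements a PyTorch version satisfies."""
    version = parse_version(text)
    return {
        # Threshold from the maintainer's comment: 1.3.0 or lower.
        "repo_supported": version <= (1, 3, 0),
        # torch.cuda.amp was added in PyTorch 1.6.
        "has_cuda_amp": version >= (1, 6),
    }


if __name__ == "__main__":
    for candidate in ("1.3.0", "1.7.0"):
        print(candidate, check_torch_version(candidate))
```

No version satisfies both conditions, which suggests the installed apex expects a newer PyTorch than this repo supports.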