Cannot install from source with rocm.

Open cmal opened this issue 4 years ago • 1 comments

🐛 Bug

export USE_NINJA=1 USE_CUDA=0 USE_ROCM=1 USE_LMDB=1 USE_OPENCV=1 MAX_JOBS=2; python setup.py install
...
[ 95%] Built target c10d
make: *** [Makefile:141:all] Error 2
Traceback (most recent call last):
  File "setup.py", line 724, in <module>
    build_deps()
  File "setup.py", line 317, in build_deps
    cmake=cmake)
  File "/home/cmal/gits/pytorch_rocm/tools/build_pytorch_libs.py", line 62, in build_caffe2
    cmake.build(my_env)
  File "/home/cmal/gits/pytorch_rocm/tools/setup_helpers/cmake.py", line 346, in build
    self.run(build_args, my_env)
  File "/home/cmal/gits/pytorch_rocm/tools/setup_helpers/cmake.py", line 141, in run
    check_call(command, cwd=self.build_dir, env=env)
  File "/usr/lib/python3.7/subprocess.py", line 363, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['cmake', '--build', '.', '--target', 'install', '--config', 'Release', '--', '-j', '5']' returned non-zero exit status 2.

To Reproduce

Steps to reproduce the behavior:

  1. export USE_NINJA=1 USE_CUDA=0 USE_ROCM=1 USE_LMDB=1 USE_OPENCV=1 MAX_JOBS=2; python setup.py install
  2. Wait until the build reaches [ 95%]; it then fails with the error shown above.
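
The traceback in this report ends at make's generic "Error 2", which does not show which target or source file failed. A minimal sketch for pulling the first error-looking lines out of a saved build log follows. It assumes the build output was captured to a file (for example by appending `2>&1 | tee build.log` to the build command; `build.log` is a hypothetical name). The helper `first_error_lines` and its patterns are illustrative assumptions, not part of PyTorch's build tooling, and the compiler line in the sample is invented for demonstration.

```python
import re
from typing import Iterable, List

# Patterns that often mark a real failure in make/ninja/compiler output.
# This set is an assumption for illustration, not an exhaustive list.
ERROR_PATTERN = re.compile(
    r"(\berror\b:|\bFAILED:|\*\*\* \[.*\] Error \d+)",
    re.IGNORECASE,
)


def first_error_lines(lines: Iterable[str], limit: int = 5) -> List[str]:
    """Return up to `limit` lines that look like build errors, in log order."""
    hits: List[str] = []
    for line in lines:
        if ERROR_PATTERN.search(line):
            hits.append(line.rstrip("\n"))
            if len(hits) >= limit:
                break
    return hits


# Two lines are taken from the report; the middle one is a hypothetical
# compiler message added only to show what the helper would surface.
sample = [
    "[ 95%] Built target c10d",
    "example.cpp:1:1: error: hypothetical compiler message",
    "make: *** [Makefile:141:all] Error 2",
]
hits = first_error_lines(sample)
for hit in hits:
    print(hit)
```

For a real log, the same function can be passed an open file object, since it accepts any iterable of lines. The earliest hit is usually the most useful one to include in a report.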

Expected behavior

The build completes successfully.

Environment

Please copy and paste the output from our environment collection script (or fill out the checklist below manually).

You can get the script and run it with:

wget https://raw.githubusercontent.com/pytorch/pytorch/master/torch/utils/collect_env.py
# For security purposes, please check the contents of collect_env.py before running it.
python collect_env.py

UserWarning: CUDA initialization: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx (Triggered internally at /pytorch/c10/cuda/CUDAFunctions.cpp:100.)
  return torch._C._cuda_getDeviceCount() > 0
PyTorch version: 1.7.0
Is debug build: True
CUDA used to build PyTorch: 10.2
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.1 LTS (x86_64)
GCC version: (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Clang version: Could not collect
CMake version: version 3.16.3

Python version: 3.7 (64-bit runtime)
Is CUDA available: False
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.19.4
[pip3] torch==1.7.0
[conda] Could not collect
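
The report above lists torch 1.7.0 built against CUDA 10.2 with "ROCM used to build PyTorch: N/A", so it appears to describe a separately installed torch package rather than the ROCm source build that failed. A minimal sketch of checking this from the report text follows. The helper `parse_env_report` is hypothetical and only splits "Key: value" lines; the excerpt is copied from the output above.

```python
from typing import Dict


def parse_env_report(text: str) -> Dict[str, str]:
    """Parse 'Key: value' lines from a collect_env-style report into a dict."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        # Skip lines without a 'Key: value' shape, such as '[pip3] torch==1.7.0'.
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


# Excerpt of the environment report from this issue.
report = """\
PyTorch version: 1.7.0
CUDA used to build PyTorch: 10.2
ROCM used to build PyTorch: N/A
HIP runtime version: N/A
"""

fields = parse_env_report(report)
# A ROCm build would be expected to report a version here instead of N/A.
describes_rocm_build = fields.get("ROCM used to build PyTorch", "N/A") != "N/A"
print("Report describes a ROCm build:", describes_rocm_build)
```

If the check is false, as it is for this excerpt, the environment details for the ROCm toolchain (for example the installed ROCm version) would need to be supplied separately.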

  • PyTorch Version (e.g., 1.0): master code
  • OS (e.g., Linux): Ubuntu 20.04
  • How you installed PyTorch (conda, pip, source): source, with export USE_NINJA=1 USE_CUDA=0 USE_ROCM=1 USE_LMDB=1 USE_OPENCV=1 MAX_JOBS=2; python setup.py install
  • Build command you used (if compiling from source): export USE_NINJA=1 USE_CUDA=0 USE_ROCM=1 USE_LMDB=1 USE_OPENCV=1 MAX_JOBS=2; python setup.py install
  • Python version: 3.7
  • CUDA/cuDNN version: ROCm
  • GPU models and configuration: gfx803
  • Any other relevant information:

Additional context

None.

cmal, Nov 08 '20 07:11

Hi @cmal, we need more details before we can help. Can you pull the latest ROCm PyTorch Docker container and confirm whether you can rebuild PyTorch from there? https://hub.docker.com/r/rocm/pytorch

sunway513, Mar 01 '21 16:03