
CUDA error: no kernel image is available for execution on the device Error from operator: type: "SpatialNarrowAsGradient"

FduJyy opened this issue on Mar 08 '18 · 11 comments

If this is a build issue, please fill out the template below.

System information

  • Operating system: Ubuntu 16.04
  • Compiler version:
  • CMake version:
  • CMake arguments:
  • Relevant libraries/versions (e.g. CUDA): CUDA 9.0, cuDNN 7.0

I installed Caffe2 from the pre-built binaries with conda install -c caffe2 caffe2-cuda9.0-cudnn7 and ran into a problem: a file called "libnccl.so.2" appears to be missing. I cloned the NCCL repository and compiled it, but the build did not produce any file called "libnccl.so.2". The problem is still unsolved.

Python 3.6.4 |Anaconda, Inc.| (default, Jan 16 2018, 18:10:19) 
[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from caffe2.python import workspace
WARNING:root:This caffe2 python run does not have GPU support. Will run in CPU only mode.
WARNING:root:Debug message: libnccl.so.2: cannot open shared object file: No such file or directory

FduJyy avatar Mar 08 '18 14:03 FduJyy

Which NCCL library did you clone? This is the script we use to install the NCCL that we build against: https://github.com/caffe2/caffe2/blob/master/docker/jenkins/common/install_nccl.sh . The library you need should be this one: http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64/nvidia-machine-learning-repo-ubuntu1404_4.0-2_amd64.deb . If you call that script with UBUNTU_VERSION=16.04 and CUDA_VERSION=9.0, it should install correctly.
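
For what it's worth, a quick way to confirm afterwards that the library is actually resolvable is to try loading it from Python. This is only a minimal sketch (it assumes the .deb above put libnccl.so.2 on the default loader path; otherwise add its directory to LD_LIBRARY_PATH first):

# Sketch: check that the dynamic loader can find libnccl.so.2 and that
# Caffe2 now reports GPU support.
import ctypes

try:
    ctypes.CDLL("libnccl.so.2")
    print("libnccl.so.2 loaded")
except OSError as e:
    print("libnccl.so.2 still not found:", e)

from caffe2.python import workspace
print("GPU support:", workspace.has_gpu_support)
print("CUDA devices:", workspace.NumCudaDevices())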

pjh5 avatar Mar 08 '18 17:03 pjh5

@pjh5 Thanks for your help! I can now run from caffe2.python import workspace without errors. Next I tried to use the Detectron platform, but after installing its dependencies and running the SpatialNarrowAsOp test, I hit another problem: Encountered CUDA error: no kernel image is available for execution on the device Error from operator: input: "A" input: "B" input: "C_grad" output: "A_grad" name: "" type: "SpatialNarrowAsGradient" device_option { device_type: 1 cuda_gpu_id: 0 } is_gradient_op: true. Do you know what might be causing this?

(caffe) jyy@jyy-OptiPlex-9020:~/Detectron$ python ./tests/test_spatial_narrow_as_op.py
E0309 14:17:00.375676  3086 init_intrinsics_check.cc:59] CPU feature avx is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
E0309 14:17:00.375697  3086 init_intrinsics_check.cc:59] CPU feature avx2 is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
E0309 14:17:00.375700  3086 init_intrinsics_check.cc:59] CPU feature fma is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
Found Detectron ops lib: /home/jyy/anaconda3/envs/caffe/lib/libcaffe2_detectron_ops_gpu.so
F.E
======================================================================
ERROR: test_small_forward_and_gradient (__main__.SpatialNarrowAsOpTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "./tests/test_spatial_narrow_as_op.py", line 59, in test_small_forward_and_gradient
    self._run_test(A, B, check_grad=True)
  File "./tests/test_spatial_narrow_as_op.py", line 49, in _run_test
    res, grad, grad_estimated = gc.CheckSimple(op, [A, B], 0, [0])
  File "/home/jyy/anaconda3/envs/caffe/lib/python2.7/site-packages/caffe2/python/gradient_checker.py", line 284, in CheckSimple
    outputs_with_grads
  File "/home/jyy/anaconda3/envs/caffe/lib/python2.7/site-packages/caffe2/python/gradient_checker.py", line 201, in GetLossAndGrad
    workspace.RunOperatorsOnce(grad_ops)
  File "/home/jyy/anaconda3/envs/caffe/lib/python2.7/site-packages/caffe2/python/workspace.py", line 184, in RunOperatorsOnce
    success = RunOperatorOnce(op)
  File "/home/jyy/anaconda3/envs/caffe/lib/python2.7/site-packages/caffe2/python/workspace.py", line 179, in RunOperatorOnce
    return C.run_operator_once(StringifyProto(operator))
RuntimeError: [enforce fail at context_gpu.h:171] . Encountered CUDA error: no kernel image is available for execution on the device Error from operator: 
input: "A" input: "B" input: "C_grad" output: "A_grad" name: "" type: "SpatialNarrowAsGradient" device_option { device_type: 1 cuda_gpu_id: 0 } is_gradient_op: true

======================================================================
FAIL: test_large_forward (__main__.SpatialNarrowAsOpTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "./tests/test_spatial_narrow_as_op.py", line 68, in test_large_forward
    self._run_test(A, B)
  File "./tests/test_spatial_narrow_as_op.py", line 54, in _run_test
    np.testing.assert_allclose(C, C_ref, rtol=1e-5, atol=1e-08)
  File "/home/jyy/anaconda3/envs/caffe/lib/python2.7/site-packages/numpy/testing/nose_tools/utils.py", line 1396, in assert_allclose
    verbose=verbose, header=header, equal_nan=equal_nan)
  File "/home/jyy/anaconda3/envs/caffe/lib/python2.7/site-packages/numpy/testing/nose_tools/utils.py", line 779, in assert_array_compare
    raise AssertionError(msg)
AssertionError: 
Not equal to tolerance rtol=1e-05, atol=1e-08

(mismatch 100.0%)
 x: array([[[[0., 0., 0., ..., 0., 0., 0.],
         [0., 0., 0., ..., 0., 0., 0.],
         [0., 0., 0., ..., 0., 0., 0.],...
 y: array([[[[ 1.707480e+00,  1.710607e+00,  1.279160e+00, ...,
          -9.014695e-01, -1.781531e+00,  4.036736e-01],
         [ 1.895508e+00, -3.324545e-01,  3.578335e-01, ...,...

----------------------------------------------------------------------
Ran 3 tests in 0.557s

FAILED (failures=1, errors=1)

FduJyy avatar Mar 09 '18 07:03 FduJyy
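
For context, "no kernel image is available for execution on the device" generally means the binary being run does not contain CUDA kernels compiled for the GPU's compute capability, which is also why building from source (compiled for the local card) tends to make it go away. A minimal sketch to check what capability the card reports, using the CUDA driver API through ctypes (libcuda.so and device 0 are assumptions here):

# Sketch: query the compute capability of GPU 0 via the CUDA driver API.
# Assumes the NVIDIA driver library (libcuda.so / libcuda.so.1) is on the
# loader path; cuInit, cuDeviceGet and cuDeviceComputeCapability are
# standard driver-API entry points.
import ctypes

try:
    cuda = ctypes.CDLL("libcuda.so")
except OSError:
    cuda = ctypes.CDLL("libcuda.so.1")

assert cuda.cuInit(0) == 0, "cuInit failed"

device = ctypes.c_int()
assert cuda.cuDeviceGet(ctypes.byref(device), 0) == 0, "cuDeviceGet failed"

major, minor = ctypes.c_int(), ctypes.c_int()
assert cuda.cuDeviceComputeCapability(
    ctypes.byref(major), ctypes.byref(minor), device) == 0
print("GPU 0 compute capability: %d.%d" % (major.value, minor.value))

If the pre-built package was not built with kernels for that capability, an operator launch can fail like this even though basic GPU checks still pass.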

@FduJyy can you try running this on CUDA 8?

@orionr should this work in CUDA 9 right now?

pjh5 avatar Mar 09 '18 17:03 pjh5

I also encountered this problem, but it goes away when I compile caffe2 from source.

======================================================================
ERROR: test_small_forward_and_gradient (__main__.SpatialNarrowAsOpTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "./tests/test_spatial_narrow_as_op.py", line 59, in test_small_forward_and_gradient
    self._run_test(A, B, check_grad=True)
  File "./tests/test_spatial_narrow_as_op.py", line 49, in _run_test
    res, grad, grad_estimated = gc.CheckSimple(op, [A, B], 0, [0])
  File "/home/jyy/anaconda3/envs/caffe/lib/python2.7/site-packages/caffe2/python/gradient_checker.py", line 284, in CheckSimple
    outputs_with_grads
  File "/home/jyy/anaconda3/envs/caffe/lib/python2.7/site-packages/caffe2/python/gradient_checker.py", line 201, in GetLossAndGrad
    workspace.RunOperatorsOnce(grad_ops)
  File "/home/jyy/anaconda3/envs/caffe/lib/python2.7/site-packages/caffe2/python/workspace.py", line 184, in RunOperatorsOnce
    success = RunOperatorOnce(op)
  File "/home/jyy/anaconda3/envs/caffe/lib/python2.7/site-packages/caffe2/python/workspace.py", line 179, in RunOperatorOnce
    return C.run_operator_once(StringifyProto(operator))
RuntimeError: [enforce fail at context_gpu.h:171] . Encountered CUDA error: no kernel image is available for execution on the device Error from operator:
input: "A" input: "B" input: "C_grad" output: "A_grad" name: "" type: "SpatialNarrowAsGradient" device_option { device_type: 1 cuda_gpu_id: 0 } is_gradient_op: true

======================================================================
FAIL: test_large_forward (__main__.SpatialNarrowAsOpTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "./tests/test_spatial_narrow_as_op.py", line 68, in test_large_forward
    self._run_test(A, B)
  File "./tests/test_spatial_narrow_as_op.py", line 54, in _run_test
    np.testing.assert_allclose(C, C_ref, rtol=1e-5, atol=1e-08)
  File "/home/jyy/anaconda3/envs/caffe/lib/python2.7/site-packages/numpy/testing/nose_tools/utils.py", line 1396, in assert_allclose
    verbose=verbose, header=header, equal_nan=equal_nan)
  File "/home/jyy/anaconda3/envs/caffe/lib/python2.7/site-packages/numpy/testing/nose_tools/utils.py", line 779, in assert_array_compare
    raise AssertionError(msg)
AssertionError:
Not equal to tolerance rtol=1e-05, atol=1e-08

NovenBae avatar Mar 13 '18 11:03 NovenBae

What does your CUDA installation look like? Can you ls -lah the folder where CUDA is installed? You can probably find it with find / -name libcuda*

pjh5 avatar Mar 14 '18 01:03 pjh5

@pjh5 Do you mean my CUDA installation? Here it is:

jyy@jyy:/usr/local/cuda-9.0$ ls -lah
drwxr-xr-x 18 root root 4.0K 3月   8 22:05 .
drwxr-xr-x 13 root root 4.0K 3月   8 22:02 ..
drwxr-xr-x  3 root root 4.0K 3月   8 22:02 bin
drwxr-xr-x  5 root root 4.0K 3月   8 22:02 doc
drwxr-xr-x  5 root root 4.0K 3月   8 22:02 extras
drwxr-xr-x  5 root root 4.0K 3月   9 21:00 include
drwxr-xr-x  5 root root 4.0K 3月   8 22:02 jre
drwxr-xr-x  3 root root 4.0K 3月   9 22:59 lib64
drwxr-xr-x  8 root root 4.0K 3月   8 22:02 libnsight
drwxr-xr-x  7 root root 4.0K 3月   8 22:02 libnvvp
drwxr-xr-x  2 root root 4.0K 3月   8 22:02 nsightee_plugins
-r--r--r--  1 root root  39K 3月   8 22:59 NVIDIA_SLA_cuDNN_Support.txt
drwxr-xr-x  3 root root 4.0K 3月   8 22:02 nvml
drwxr-xr-x  7 root root 4.0K 3月   8 22:02 nvvm
drwxr-xr-x  2 root root 4.0K 3月   8 22:02 pkgconfig
drwxr-xr-x 11 root root 4.0K 3月   8 22:02 samples
drwxr-xr-x  3 root root 4.0K 3月   8 22:02 share
drwxr-xr-x  2 root root 4.0K 3月   8 22:02 src
drwxr-xr-x  2 root root 4.0K 3月   8 22:02 tools
-rw-r--r--  1 root root   21 3月   8 22:02 version.txt
jyy@jyy:/usr/local/cuda-9.0$ ls -lah lib64
drwxr-xr-x  3 root root  4.0K 3月   9 22:59 .
drwxr-xr-x 18 root root  4.0K 3月   8 22:05 ..
lrwxrwxrwx  1 root root    18 3月   8 22:02 libaccinj64.so -> libaccinj64.so.9.0
lrwxrwxrwx  1 root root    22 3月   8 22:02 libaccinj64.so.9.0 -> libaccinj64.so.9.0.176
-rwxr-xr-x  1 root root  6.6M 3月   8 22:02 libaccinj64.so.9.0.176
-rw-r--r--  1 root root   67M 3月   8 22:02 libcublas_device.a
lrwxrwxrwx  1 root root    16 3月   8 22:02 libcublas.so -> libcublas.so.9.0
lrwxrwxrwx  1 root root    20 3月   8 22:02 libcublas.so.9.0 -> libcublas.so.9.0.176
-rwxr-xr-x  1 root root   51M 3月   8 22:02 libcublas.so.9.0.176
-rw-r--r--  1 root root   57M 3月   8 22:02 libcublas_static.a
-rw-r--r--  1 root root  624K 3月   8 22:02 libcudadevrt.a
lrwxrwxrwx  1 root root    16 3月   8 22:02 libcudart.so -> libcudart.so.9.0
lrwxrwxrwx  1 root root    20 3月   8 22:02 libcudart.so.9.0 -> libcudart.so.9.0.176
-rwxr-xr-x  1 root root  433K 3月   8 22:02 libcudart.so.9.0.176
-rw-r--r--  1 root root  812K 3月   8 22:02 libcudart_static.a
-rwxr-xr-x  1 root root  306M 3月   9 22:59 libcudnn.so
-rwxr-xr-x  1 root root  306M 3月   9 22:59 libcudnn.so.7
-rwxr-xr-x  1 root root  275M 3月   9 21:00 libcudnn.so.7.0.5
-rwxr-xr-x  1 root root  306M 3月   9 22:59 libcudnn.so.7.1.1
-rw-r--r--  1 root root  302M 3月   9 23:00 libcudnn_static.a
lrwxrwxrwx  1 root root    15 3月   8 22:02 libcufft.so -> libcufft.so.9.0
lrwxrwxrwx  1 root root    19 3月   8 22:02 libcufft.so.9.0 -> libcufft.so.9.0.176
-rwxr-xr-x  1 root root  127M 3月   8 22:02 libcufft.so.9.0.176
-rw-r--r--  1 root root  131M 3月   8 22:02 libcufft_static.a
lrwxrwxrwx  1 root root    16 3月   8 22:02 libcufftw.so -> libcufftw.so.9.0
lrwxrwxrwx  1 root root    20 3月   8 22:02 libcufftw.so.9.0 -> libcufftw.so.9.0.176
-rwxr-xr-x  1 root root  496K 3月   8 22:02 libcufftw.so.9.0.176
-rw-r--r--  1 root root   41K 3月   8 22:02 libcufftw_static.a
lrwxrwxrwx  1 root root    17 3月   8 22:02 libcuinj64.so -> libcuinj64.so.9.0
lrwxrwxrwx  1 root root    21 3月   8 22:02 libcuinj64.so.9.0 -> libcuinj64.so.9.0.176
-rwxr-xr-x  1 root root  6.9M 3月   8 22:02 libcuinj64.so.9.0.176
-rw-r--r--  1 root root  1.6M 3月   8 22:02 libculibos.a
lrwxrwxrwx  1 root root    16 3月   8 22:02 libcurand.so -> libcurand.so.9.0
lrwxrwxrwx  1 root root    20 3月   8 22:02 libcurand.so.9.0 -> libcurand.so.9.0.176
-rwxr-xr-x  1 root root   57M 3月   8 22:02 libcurand.so.9.0.176
-rw-r--r--  1 root root   57M 3月   8 22:02 libcurand_static.a
lrwxrwxrwx  1 root root    18 3月   8 22:02 libcusolver.so -> libcusolver.so.9.0
lrwxrwxrwx  1 root root    22 3月   8 22:02 libcusolver.so.9.0 -> libcusolver.so.9.0.176
-rwxr-xr-x  1 root root   74M 3月   8 22:02 libcusolver.so.9.0.176
-rw-r--r--  1 root root   34M 3月   8 22:02 libcusolver_static.a
lrwxrwxrwx  1 root root    18 3月   8 22:02 libcusparse.so -> libcusparse.so.9.0
lrwxrwxrwx  1 root root    22 3月   8 22:02 libcusparse.so.9.0 -> libcusparse.so.9.0.176
-rwxr-xr-x  1 root root   54M 3月   8 22:02 libcusparse.so.9.0.176
-rw-r--r--  1 root root   62M 3月   8 22:02 libcusparse_static.a
lrwxrwxrwx  1 root root    14 3月   8 22:02 libnppc.so -> libnppc.so.9.0
lrwxrwxrwx  1 root root    18 3月   8 22:02 libnppc.so.9.0 -> libnppc.so.9.0.176
-rwxr-xr-x  1 root root  478K 3月   8 22:02 libnppc.so.9.0.176
-rw-r--r--  1 root root   24K 3月   8 22:02 libnppc_static.a
lrwxrwxrwx  1 root root    16 3月   8 22:02 libnppial.so -> libnppial.so.9.0
lrwxrwxrwx  1 root root    20 3月   8 22:02 libnppial.so.9.0 -> libnppial.so.9.0.176
-rwxr-xr-x  1 root root   11M 3月   8 22:02 libnppial.so.9.0.176
-rw-r--r--  1 root root   16M 3月   8 22:02 libnppial_static.a
lrwxrwxrwx  1 root root    16 3月   8 22:02 libnppicc.so -> libnppicc.so.9.0
lrwxrwxrwx  1 root root    20 3月   8 22:02 libnppicc.so.9.0 -> libnppicc.so.9.0.176
-rwxr-xr-x  1 root root  4.1M 3月   8 22:02 libnppicc.so.9.0.176
-rw-r--r--  1 root root  4.8M 3月   8 22:02 libnppicc_static.a
lrwxrwxrwx  1 root root    17 3月   8 22:02 libnppicom.so -> libnppicom.so.9.0
lrwxrwxrwx  1 root root    21 3月   8 22:02 libnppicom.so.9.0 -> libnppicom.so.9.0.176
-rwxr-xr-x  1 root root  1.3M 3月   8 22:02 libnppicom.so.9.0.176
-rw-r--r--  1 root root 1011K 3月   8 22:02 libnppicom_static.a
lrwxrwxrwx  1 root root    17 3月   8 22:02 libnppidei.so -> libnppidei.so.9.0
lrwxrwxrwx  1 root root    21 3月   8 22:02 libnppidei.so.9.0 -> libnppidei.so.9.0.176
-rwxr-xr-x  1 root root  7.5M 3月   8 22:02 libnppidei.so.9.0.176
-rw-r--r--  1 root root   11M 3月   8 22:02 libnppidei_static.a
lrwxrwxrwx  1 root root    15 3月   8 22:02 libnppif.so -> libnppif.so.9.0
lrwxrwxrwx  1 root root    19 3月   8 22:02 libnppif.so.9.0 -> libnppif.so.9.0.176
-rwxr-xr-x  1 root root   55M 3月   8 22:02 libnppif.so.9.0.176
-rw-r--r--  1 root root   60M 3月   8 22:02 libnppif_static.a
lrwxrwxrwx  1 root root    15 3月   8 22:02 libnppig.so -> libnppig.so.9.0
lrwxrwxrwx  1 root root    19 3月   8 22:02 libnppig.so.9.0 -> libnppig.so.9.0.176
-rwxr-xr-x  1 root root   27M 3月   8 22:02 libnppig.so.9.0.176
-rw-r--r--  1 root root   30M 3月   8 22:02 libnppig_static.a
lrwxrwxrwx  1 root root    15 3月   8 22:02 libnppim.so -> libnppim.so.9.0
lrwxrwxrwx  1 root root    19 3月   8 22:02 libnppim.so.9.0 -> libnppim.so.9.0.176
-rwxr-xr-x  1 root root  4.9M 3月   8 22:02 libnppim.so.9.0.176
-rw-r--r--  1 root root  4.9M 3月   8 22:02 libnppim_static.a
lrwxrwxrwx  1 root root    16 3月   8 22:02 libnppist.so -> libnppist.so.9.0
lrwxrwxrwx  1 root root    20 3月   8 22:02 libnppist.so.9.0 -> libnppist.so.9.0.176
-rwxr-xr-x  1 root root   15M 3月   8 22:02 libnppist.so.9.0.176
-rw-r--r--  1 root root   20M 3月   8 22:02 libnppist_static.a
lrwxrwxrwx  1 root root    16 3月   8 22:02 libnppisu.so -> libnppisu.so.9.0
lrwxrwxrwx  1 root root    20 3月   8 22:02 libnppisu.so.9.0 -> libnppisu.so.9.0.176
-rwxr-xr-x  1 root root  467K 3月   8 22:02 libnppisu.so.9.0.176
-rw-r--r--  1 root root   11K 3月   8 22:02 libnppisu_static.a
lrwxrwxrwx  1 root root    16 3月   8 22:02 libnppitc.so -> libnppitc.so.9.0
lrwxrwxrwx  1 root root    20 3月   8 22:02 libnppitc.so.9.0 -> libnppitc.so.9.0.176
-rwxr-xr-x  1 root root  2.9M 3月   8 22:02 libnppitc.so.9.0.176
-rw-r--r--  1 root root  3.9M 3月   8 22:02 libnppitc_static.a
lrwxrwxrwx  1 root root    14 3月   8 22:02 libnpps.so -> libnpps.so.9.0
lrwxrwxrwx  1 root root    18 3月   8 22:02 libnpps.so.9.0 -> libnpps.so.9.0.176
-rwxr-xr-x  1 root root  8.9M 3月   8 22:02 libnpps.so.9.0.176
-rw-r--r--  1 root root   12M 3月   8 22:02 libnpps_static.a
lrwxrwxrwx  1 root root    16 3月   8 22:02 libnvblas.so -> libnvblas.so.9.0
lrwxrwxrwx  1 root root    20 3月   8 22:02 libnvblas.so.9.0 -> libnvblas.so.9.0.176
-rwxr-xr-x  1 root root  519K 3月   8 22:02 libnvblas.so.9.0.176
lrwxrwxrwx  1 root root    17 3月   8 22:02 libnvgraph.so -> libnvgraph.so.9.0
lrwxrwxrwx  1 root root    21 3月   8 22:02 libnvgraph.so.9.0 -> libnvgraph.so.9.0.176
-rwxr-xr-x  1 root root   23M 3月   8 22:02 libnvgraph.so.9.0.176
-rw-r--r--  1 root root   53M 3月   8 22:02 libnvgraph_static.a
lrwxrwxrwx  1 root root    24 3月   8 22:02 libnvrtc-builtins.so -> libnvrtc-builtins.so.9.0
lrwxrwxrwx  1 root root    28 3月   8 22:02 libnvrtc-builtins.so.9.0 -> libnvrtc-builtins.so.9.0.176
-rwxr-xr-x  1 root root  3.2M 3月   8 22:02 libnvrtc-builtins.so.9.0.176
lrwxrwxrwx  1 root root    15 3月   8 22:02 libnvrtc.so -> libnvrtc.so.9.0
lrwxrwxrwx  1 root root    19 3月   8 22:02 libnvrtc.so.9.0 -> libnvrtc.so.9.0.176
-rwxr-xr-x  1 root root   22M 3月   8 22:02 libnvrtc.so.9.0.176
lrwxrwxrwx  1 root root    18 3月   8 22:02 libnvToolsExt.so -> libnvToolsExt.so.1
lrwxrwxrwx  1 root root    22 3月   8 22:02 libnvToolsExt.so.1 -> libnvToolsExt.so.1.0.0
-rwxr-xr-x  1 root root   37K 3月   8 22:02 libnvToolsExt.so.1.0.0
lrwxrwxrwx  1 root root    14 3月   8 22:02 libOpenCL.so -> libOpenCL.so.1
lrwxrwxrwx  1 root root    16 3月   8 22:02 libOpenCL.so.1 -> libOpenCL.so.1.0
lrwxrwxrwx  1 root root    18 3月   8 22:02 libOpenCL.so.1.0 -> libOpenCL.so.1.0.0
-rw-r--r--  1 root root   26K 3月   8 22:02 libOpenCL.so.1.0.0
drwxr-xr-x  2 root root  4.0K 3月   8 22:02 stubs

FduJyy avatar Mar 15 '18 09:03 FduJyy

Do these run if you use

# set CAFFE2_PYPATH on its own line first, so the --ignore paths below
# expand to the right value
CAFFE2_PYPATH=/home/jyy/anaconda3/envs/caffe/lib/python2.7/site-packages/caffe2/python
python -m pytest -x -v -s \
  --ignore "$CAFFE2_PYPATH/test/executor_test.py" \
  --ignore "$CAFFE2_PYPATH/operator_test/matmul_op_test.py" \
  --ignore "$CAFFE2_PYPATH/operator_test/pack_ops_test.py" \
  --ignore "$CAFFE2_PYPATH/mkl/mkl_sbn_speed_test.py" \
  "$CAFFE2_PYPATH"

pjh5 avatar Mar 19 '18 21:03 pjh5

I have the same problem as @FduJyy on Detectron.

When I run your suggested tests I get this:

============================= test session starts =============================
platform linux2 -- Python 2.7.14, pytest-3.5.0, py-1.5.3, pluggy-0.6.0 -- /home/joaofayad/anaconda3/envs/detectron/bin/python
cachedir: .pytest_cache
rootdir: /home/joaofayad/detectron, inifile:
collected 34 items

lib/core/test_engine.py::test_net_on_dataset ERROR [ 2%]

=================================== ERRORS ====================================
____________________ ERROR at setup of test_net_on_dataset ____________________
file /home/joaofayad/detectron/lib/core/test_engine.py, line 126
  def test_net_on_dataset(
E       fixture 'weights_file' not found
>       available fixtures: cache, capfd, capfdbinary, caplog, capsys, capsysbinary, doctest_namespace, monkeypatch, pytestconfig, record_property, record_xml_attribute, record_xml_property, recwarn, tmpdir, tmpdir_factory
>       use 'pytest --fixtures [testpath]' for help on them.

joaofayad avatar Mar 28 '18 19:03 joaofayad

@pjh5 @joaofayad Sorry for the late reply. Following @NovenBae's advice, I solved the problem by compiling caffe2 from source with conda build (as described on the official website). Now everything works and Detectron runs well.

FduJyy avatar Mar 29 '18 14:03 FduJyy

@pjh5 @joaofayad @FduJyy I am having the same problem. I installed caffe2 with the pre-built binaries. Everything works and the GPU test returns 1, but when I run the Detectron installation test I hit the same FAILED (failures=1, errors=1) error. I used CUDA 8 and cuDNN 7 for the installation. When I run pjh5's tests, they fail with an ImportError. I am using an Azure DSVM, and the X2Go interface does not let me copy-paste, so I took a snippet of the screen: image. Now I am going to try reinstalling caffe2 from the main website (build from source). @FduJyy is that what you meant? Not using conda install -c caffe2 caffe2-cuda8.0-cudnn7 and instead using a list of pip install commands? Thank you!

BanuSelinTosun avatar Jul 10 '18 20:07 BanuSelinTosun

@FduJyy I met the exact same problem and was able to solve it with your method. Thanks!

zwangab91 avatar Aug 02 '18 00:08 zwangab91