
runtime error in 'bench_cg.py'

Open ivan-gusachenko opened this issue 5 years ago • 0 comments

Hello, I'm getting a runtime error when running the bench_cg.py example: RuntimeError: In function std::vector<char> cuda::compileToPTX(const char*, std::string)

Also, af.info() seems to see the GPU device, but it doesn't assign it an ID, nor does it detect its memory or compute capability. At the same time, bench_fft.py seems to work.
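For completeness, this is roughly what I'm checking with. A minimal sketch; I'm assuming device_info() and get_device_count() are the right helpers in this arrayfire-python version:

import arrayfire as af

# Prints the same banner shown in the transcripts below; on this machine the
# device id, memory and compute capability fields come back empty.
af.info()

# device_info() should return a dict with 'device', 'backend', 'toolkit'
# and 'compute' entries when the backend reports them.
print(af.get_device_count())
print(af.device_info())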

(venv3) igu@demeter:~$ nvidia-smi
Fri Apr 19 17:45:53 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.39       Driver Version: 418.39       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 980     Off  | 00000000:02:00.0 Off |                  N/A |
| 30%   42C    P0    45W / 195W |      0MiB /  4040MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
(venv3) igu@demeter:~$ export | grep LD_LIB
declare -x LD_LIBRARY_PATH=":/usr/local/cuda/lib64:/opt/arrayfire/lib64:/usr/local/cuda/nvvm/lib64"
(venv3) igu@demeter:~$ python /mnt/Simulations/IvanG/WORKSPACE/arrayfire_examples/examples/benchmarks/bench_fft.py
ArrayFire v3.6.2 (CUDA, 64-bit Linux, build dc38ef1)
Platform: CUDA Toolkit , Driver: 418.39
[] GeForce GTX 980,  MB, CUDA Compute .
Benchmark N x N 2D fft on arrayfire
Time taken for  128 x  128: 0.9280 Gflops
Time taken for  256 x  256: 174.0147 Gflops
Time taken for  512 x  512: 300.4586 Gflops
Time taken for 1024 x 1024: 298.9022 Gflops
Time taken for 2048 x 2048: 287.5326 Gflops
Time taken for 4096 x 4096: 309.7844 Gflops
Benchmark N x N 2D fft on numpy
Time taken for  128 x  128: 4.4689 Gflops
Time taken for  256 x  256: 4.8368 Gflops
Time taken for  512 x  512: 4.4340 Gflops
Time taken for 1024 x 1024: 3.8707 Gflops
(venv3) igu@demeter:~$ python /mnt/Simulations/IvanG/WORKSPACE/arrayfire_examples/examples/benchmarks/bench_cg.py
ArrayFire v3.6.2 (CUDA, 64-bit Linux, build dc38ef1)
Platform: CUDA Toolkit , Driver: 418.39
[] GeForce GTX 980,  MB, CUDA Compute .

Testing benchmark functions...
Traceback (most recent call last):
  File "/mnt/Simulations/IvanG/WORKSPACE/arrayfire_examples/examples/benchmarks/bench_cg.py", line 199, in <module>
    test()
  File "/mnt/Simulations/IvanG/WORKSPACE/arrayfire_examples/examples/benchmarks/bench_cg.py", line 141, in test
    A, b, x0 = setup_input(n=50, sparsity=7)  # dense A
  File "/mnt/Simulations/IvanG/WORKSPACE/arrayfire_examples/examples/benchmarks/bench_cg.py", line 51, in setup_input
    A = A.T + A + n*af.identity(n, n, dtype=af.Dtype.f32)
  File "/opt/venv3/lib/python3.6/site-packages/arrayfire/array.py", line 664, in T
    return transpose(self, False)
  File "/opt/venv3/lib/python3.6/site-packages/arrayfire/array.py", line 318, in transpose
    safe_call(backend.get().af_transpose(c_pointer(out.arr), a.arr, conj))
  File "/opt/venv3/lib/python3.6/site-packages/arrayfire/util.py", line 79, in safe_call
    raise RuntimeError(to_str(err_str))
RuntimeError: In function std::vector<char> cuda::compileToPTX(const char*, std::string)
In file src/backend/cuda/jit.cpp:
(venv3) igu@demeter:~$
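The failure does not look specific to bench_cg.py itself. Here is a stripped-down sketch mirroring the setup_input() line from the benchmark; whether it reproduces the exact same error is an assumption on my part, but it should exercise the same transpose + JIT-fused arithmetic path:

import arrayfire as af

n = 50
A = af.randu(n, n, dtype=af.Dtype.f32)
# Same expression as setup_input() in bench_cg.py; the transpose plus the
# fused addition is where the RuntimeError is raised in the traceback above.
B = A.T + A + n * af.identity(n, n, dtype=af.Dtype.f32)
af.eval(B)  # force evaluation of the lazily built expression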

Also, setting the backend doesn't seem to have an effect:

>>> af.get_backend()
'unified'
>>> af.set_backend('cuda')
>>> af.get_backend()
'unified'
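If it helps, this is how I'm checking which backend is actually active. A sketch; I'm assuming get_active_backend() and get_available_backends() exist in this arrayfire-python version:

import arrayfire as af

print(af.get_available_backends())  # backends the unified loader can see
af.set_backend('cuda')
print(af.get_backend())             # still reports 'unified' here
print(af.get_active_backend())      # should name the backend actually dispatched to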

Thank you, Ivan

ivan-gusachenko · Apr 19 '19 15:04