Isuru Fernando comments

Results: 928 comments of Isuru Fernando

Segmentation fault in LLVM 10 and 11 when trying a simple SYCL reduction on an NVIDIA GPU

Ah okay. Since pocl changes the arguments to support local memory in CUDA, this isn't going to work. (NVIDIA's OpenCL seems to be using a non-public API to support local...

lib/CL/devices/cuda/pocl-cuda failed to compile for arm architecture

For Jetson Nano, you need https://github.com/pocl/pocl/pull/890

TBB device

I rebased this PR on top of master branch. There are 2 tests failing though.

Two tests in pocl's internal test suite, run by `ctest`.

```
The following tests FAILED:
    113 - runtime/clCreateSubDevices (Failed)
    157 - EinsteinToolkit_SubDev (Subprocess aborted)
Errors while running CTest
```

I guess...

TBB device

Any suggestions on how to fix those tests? @pjaaskel, I looked through the suggestions, but I don't understand what to do in

> In check_cmd_queue_for_device() - the for loop (DL_FOREACH)...

Low performance on trivial kernels for CUDA backend

Is pocl not using standard LLVM passes that clang is using? I see pocl producing kernels with code similar to what `clang -O1 -x cl` produces and much different to...

[CUDA] pocl on Jetson Tx2 / Segfault

The CUDA backend doesn't support all the features that the pthread (CPU) backend does. Can you share `demo_float32.py`?

[CUDA] pocl on Jetson Tx2 / Segfault

`| ERROR | /home/nvidia/.cache/pocl/kcache//program.bc does not exist!` That's an error I haven't seen before. Not sure what is going on here.

[CUDA] pocl on Jetson Tx2 / Segfault

The path is wrong. `/home/nvidia/.cache/pocl/kcache//program.bc` should have been something like `/home/nvidia/.cache/pocl/kcache/AB/KMNAJOCCCKLCIDHODGOFINCNGMCALPPONOGCO/program.bc`
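To make the difference between the two paths concrete, here is a small Python sketch. It is not pocl's code: `join_cache_path` and `has_empty_component` are hypothetical helpers, and the idea that the path is built as cache root, then a key, then `program.bc` is only an assumption read off the example path above. Under that assumption, an empty key would produce exactly the doubled slash seen in the error.

```python
def join_cache_path(cache_root: str, key: str) -> str:
    """Hypothetical join: cache root, then a key directory, then program.bc.

    This layout is an assumption inferred from the example path,
    not taken from pocl's source.
    """
    return cache_root + "/" + key + "/program.bc"


def has_empty_component(path: str) -> bool:
    """Return True if an absolute path contains an empty component.

    The first element of the split is always empty for an absolute path,
    so it is skipped; any other empty element means a doubled slash.
    """
    parts = path.split("/")
    return any(part == "" for part in parts[1:])


root = "/home/nvidia/.cache/pocl/kcache"
broken = join_cache_path(root, "")
expected = join_cache_path(root, "AB/KMNAJOCCCKLCIDHODGOFINCNGMCALPPONOGCO")

print(broken, has_empty_component(broken))
print(expected, has_empty_component(expected))
```

A check like this only flags the symptom (a missing key directory); it does not say why the key ended up empty.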

[CUDA] pocl on Jetson Tx2 / Segfault

I could if I had access to a Jetson, but I don't.