Jeff Hammond

407 comments by Jeff Hammond

Thanks. I forgot about Spack. I was getting decent results with Apt and Yum but of course not for the CUDA bits.

Unfortunately, the Spack build works for CPU but not for CUDA.

```sh
jrhammon@klondike:~/prk-repo/Cxx11$ ./nstream-sycl.cpu 10 1000
Parallel Research Kernels version 2.16
C++11/SYCL STREAM triad: A = B + scalar *...
```

Just to be clear, CUDA works on this machine:

```sh
jrhammon@klondike:~/prk-repo/Cxx11$ ./nstream-cuda 10 10000000
Parallel Research Kernels version 2.16
C++11/CUDA STREAM triad: A = B + scalar * C
device...
```

```sh
jrhammon@klondike:~/prk-repo/Cxx11$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Nov__3_21:07:56_CDT_2017
Cuda compilation tools, release 9.1, V9.1.85
```

Yeah, too old 😞

```
jrhammon@klondike:~/prk-repo/common$ cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 390.132 Fri Nov 1 00:40:14 PDT 2019
GCC version: gcc version 7.4.0 (Ubuntu 7.4.0-1ubuntu1~18.04.1)
```

> Anyway, we are in 2020 right? :-)

I would argue we are actually all dead and in Hell, but I don't mind calling it 2020 🤷‍♂️

Yes, reductions are key. If you haven't already read it, the Intel proposal is [here](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/reduction/reduction.md). There is much less use of scan but it is important for e.g. Kokkos.

Also observed with 4aea57afbd37faa241b0dfe67088e66fa11b55de

```
-- Building hipSYCL against LLVM configured from /opt/hipSYCL/llvm/lib/cmake/llvm
-- Selecting clang: /opt/hipSYCL/llvm/bin/clang++
-- Using clang include directory: /opt/hipSYCL/llvm/lib/clang/9.0.1/include
-- Found OpenMP_C: -fopenmp=libomp
-- Found OpenMP_CXX:...
```

I don't know how to point the linker to /opt/rocm-3.9.0/hip/lib/libamdhip64.so when it does not use ROCM_LINK_LINE, but I worked around it by running the correct command directly.

```
[100%] Linking...
```