Tobias Ribizel
All relevant pipelines ran through, so I'll go ahead and merge this
First, unfortunately your build will likely fail, since we don't support gfx10xx (yet), see #1429. Second, these environment variables should no longer be necessary since #1334, as long as `amdclang++`...
I think the easiest solution should be pointing [HIPCXX](https://cmake.org/cmake/help/latest/envvar/HIPCXX.html) at `amdclang++`, if it's not already in the PATH. Though it might also help just to try out a newer version...
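A minimal sketch of that suggestion, assuming ROCm is installed under `/opt/rocm` and ships `amdclang++` in its `bin` directory (both paths are assumptions, so adjust `ROCM_PATH` to your installation):

```shell
# Assumption: ROCm lives under /opt/rocm unless ROCM_PATH says otherwise.
ROCM_PATH="${ROCM_PATH:-/opt/rocm}"

# Only set HIPCXX when amdclang++ is not already discoverable via PATH.
if ! command -v amdclang++ >/dev/null 2>&1; then
    export HIPCXX="${ROCM_PATH}/bin/amdclang++"
fi

# Show what CMake will see when it looks for a HIP compiler.
echo "HIPCXX=${HIPCXX:-<unset, amdclang++ found in PATH>}"
```

CMake reads `HIPCXX` only on the first configure of a build directory, so start from a fresh build directory after changing it.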
Can you try setting `-DCMAKE_PREFIX_PATH=/opt/rocm-6.1.1` as outlined in https://rocm.docs.amd.com/en/latest/conceptual/cmake-packages.html? By choosing to install ROCm in a non-standard location like `/usr` in their packages, AMD made it slightly harder for things...
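For illustration, the configure call could look roughly like this. The source and build directories (`.` and `build`) and the `ROCM_PREFIX` variable are placeholders for this sketch, not something prescribed by the project:

```shell
# Assumption: ROCm 6.1.1 is installed under /opt/rocm-6.1.1, as in the comment above.
ROCM_PREFIX="${ROCM_PREFIX:-/opt/rocm-6.1.1}"

# Placeholder source (.) and build (build) directories; adjust to your checkout.
CMAKE_ARGS="-S . -B build -DCMAKE_PREFIX_PATH=${ROCM_PREFIX}"

# Print the command first so it can be inspected before running it.
echo "cmake ${CMAKE_ARGS}"

# Uncomment to actually configure from inside the source tree:
# cmake ${CMAKE_ARGS}
```

`CMAKE_PREFIX_PATH` adds the ROCm prefix to the locations searched by `find_package`, which is how CMake then finds the ROCm package configuration files.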
Dealing with warp size 32 requires some refactoring on our side (we assume the warp size is known on the host at compile time, and this assumption is violated in...
IIRC the ROCm clang compiler always claims the warp size is 64 from the host side regardless of the device architecture, so that will not make a difference. You could...
I've been planning on setting up a CI system with a consumer GPU for a while now, so I guess this is a good time to get started ;)
If you apply the following patch and export your files using `write_binary` instead of `write`, you can store the rhs vector exactly:

```patch
diff --git a/benchmark/solver/solver_common.hpp b/benchmark/solver/solver_common.hpp
index 19e718e08..541a9f662 100644...
```
This also means bumping GCC to version 7, Clang to version 5 and CUDA to version 11
@thoasm changes in std_extensions would technically break interface, so I wanted to avoid them