Investigate why llvm-amdgpu's clang++ hangs when compiling synchronize.cu
Compiling synchronize.cu is very slow. This part of the code is the problem: https://github.com/pika-org/pika/blob/8e6ac7e251a75e5e468de7c987f2f1d074151943/libs/pika/async_cuda/tests/performance/synchronize.cu#L150-L168
A minimal version of that test, which limits the loop unrolling to only 6 repetitions of `ex::transfer(sched) | cu::then_with_stream(f) | ex::transfer(ex::thread_pool_scheduler{}) | ex::then([] {})`, takes more than 8 minutes to compile. With 7 repetitions it takes more than 14 minutes.
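For anyone picking this up, here is a rough, pika-free sketch of how the chained expression's type could grow with each unrolled repetition. Everything in it (`just_t`, `stage`, `adaptor`, `one_iteration`, `repeat`, `nesting_depth`) is a hypothetical stand-in, not pika's API, and it assumes each piped adaptor wraps the previous sender type in one more template layer, which has not been confirmed as the cause here. Note that the nesting in this toy grows only linearly (4 layers per repetition), so it does not by itself explain the timings above; it may still be a starting point for a smaller reproducer.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

// Toy stand-ins, NOT pika types.
struct just_t {};  // hypothetical source sender

template <typename Prev>
struct stage {     // one layer wrapped around the previous sender type
    Prev prev;
};

struct adaptor {}; // hypothetical adaptor (stands in for transfer/then/...)

// Piping a sender into an adaptor wraps it in one more `stage` layer.
template <typename S>
constexpr stage<S> operator|(S s, adaptor) {
    return stage<S>{std::move(s)};
}

// Counts how many `stage` layers a type has.
template <typename T>
struct depth : std::integral_constant<std::size_t, 0> {};

template <typename P>
struct depth<stage<P>>
    : std::integral_constant<std::size_t, 1 + depth<P>::value> {};

// One unrolled repetition: four piped adaptors, mirroring the shape of
// transfer | then_with_stream | transfer | then in the issue.
template <typename S>
constexpr auto one_iteration(S s) {
    return std::move(s) | adaptor{} | adaptor{} | adaptor{} | adaptor{};
}

// Applies one_iteration N times (requires C++17 for `if constexpr`).
template <std::size_t N, typename S>
constexpr auto repeat(S s) {
    if constexpr (N == 0) {
        return s;
    } else {
        return repeat<N - 1>(one_iteration(std::move(s)));
    }
}

// Nesting depth of the type produced by N repetitions.
template <std::size_t N>
constexpr std::size_t nesting_depth() {
    return depth<decltype(repeat<N>(just_t{}))>::value;
}

static_assert(nesting_depth<6>() == 24, "6 repetitions -> 24 layers");
static_assert(nesting_depth<7>() == 28, "7 repetitions -> 28 layers");
```

Compiling this with `-std=c++17` should be near-instant, so if a real reduced case is needed, the stand-in adaptors would have to be replaced step by step with the actual pika ones to see which of them triggers the slowdown.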
The test was partially disabled in #510.