make `run_loop` lock-free and usable from device code

Open ericniebler opened this issue 8 months ago • 8 comments

Description

the current implementation of run_loop uses std::mutex and std::condition_variable. these types have no analogue in cuda::std::. as a result, run_loop has been limited to host-only, and hence so has the sync_wait algorithm.

in a future PR i want to add a stream-aware sync_wait that drives a run_loop from a host thread, and call run_loop::finish() from a SIMT thread. for that, i need a host/device run_loop, and it must still be thread-safe.

stdexec has an __atomic_intrusive_queue type that has seen heavy use. this PR slurps in that queue type and implements run_loop on top of it, delegating the atomic operations to cuda::std::atomic. it uses atomic::wait and atomic::notify_all to coordinate between threads.

the end result is a run_loop that can be used both from device and from host.
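
for readers unfamiliar with the technique, here is a minimal host-only sketch of an atomic intrusive queue. it is not stdexec's `__atomic_intrusive_queue` and not the code in this PR: the names `task`, `atomic_intrusive_queue`, `push`, `pop_all`, and `drain_ids` are hypothetical, it uses `std::atomic` where the PR delegates to `cuda::std::atomic`, and it leaves out the `atomic::wait` / `atomic::notify_all` blocking so that it builds as plain C++17.

```cpp
#include <atomic>
#include <vector>

// Hypothetical node type: the queue stores no nodes of its own, it only
// links caller-owned objects through their `next` member (hence "intrusive").
struct task {
  task* next = nullptr;
  int id = 0;
};

// Sketch of a multi-producer, single-consumer lock-free queue.
class atomic_intrusive_queue {
  std::atomic<task*> head_{nullptr};

public:
  // Producers push onto the front of a singly linked list with a CAS loop.
  // On failure, compare_exchange_weak reloads `old`, so the loop retries
  // against the current head.
  void push(task* t) noexcept {
    task* old = head_.load(std::memory_order_relaxed);
    do {
      t->next = old;
    } while (!head_.compare_exchange_weak(old, t,
                                          std::memory_order_release,
                                          std::memory_order_relaxed));
  }

  // The consumer detaches the whole list in one atomic exchange, then
  // reverses it so tasks come out in the order they were pushed.
  task* pop_all() noexcept {
    task* list = head_.exchange(nullptr, std::memory_order_acquire);
    task* fifo = nullptr;
    while (list != nullptr) {
      task* next = list->next;
      list->next = fifo;
      fifo = list;
      list = next;
    }
    return fifo;
  }
};

// Illustration-only helper: drain the queue and collect the task ids.
std::vector<int> drain_ids(atomic_intrusive_queue& q) {
  std::vector<int> ids;
  for (task* t = q.pop_all(); t != nullptr; t = t->next) {
    ids.push_back(t->id);
  }
  return ids;
}
```

a run loop built on a queue like this would call `pop_all` in its driver thread and execute each task in turn. because the consumer takes the whole list at once rather than popping single nodes, the sketch sidesteps the ABA problem that a lock-free single-node pop would have to handle.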

Checklist

  • [x] New or existing tests cover these changes.
  • [x] The documentation is up to date with these changes.

ericniebler avatar Apr 22 '25 21:04 ericniebler

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

copy-pr-bot[bot] avatar Apr 22 '25 21:04 copy-pr-bot[bot]

🟨 CI finished in 22m 16s: Pass: 84%/26 | Total: 2h 31m | Avg: 5m 50s | Max: 14m 07s | Hits: 95%/12316
  • 🟨 cudax: Pass: 84%/26 | Total: 2h 31m | Avg: 5m 50s | Max: 14m 07s | Hits: 95%/12316

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  81%/22  | Total:  2h 15m | Avg:  6m 08s | Max: 14m 07s | Hits:  95%/9968  
      🟩 arm64              Pass: 100%/4   | Total: 16m 42s | Avg:  4m 10s | Max:  5m 12s | Hits:  93%/2348  
    🔍 cxx_family: GCC 🔍
      🟩 Clang              Pass: 100%/10  | Total: 44m 56s | Avg:  4m 29s | Max: 11m 54s | Hits:  98%/5874  
      🔍 GCC                Pass:  66%/12  | Total:  1h 07m | Avg:  5m 38s | Max: 14m 07s | Hits:  91%/4696  
      🟩 MSVC               Pass: 100%/2   | Total: 22m 52s | Avg: 11m 26s | Max: 11m 40s | Hits:  91%/576   
      🟩 NVHPC              Pass: 100%/2   | Total: 16m 23s | Avg:  8m 11s | Max:  8m 20s | Hits:  96%/1170  
    🔍 gpu: rtx2080 🔍
      🟩 h100               Pass: 100%/2   | Total: 18m 33s | Avg:  9m 16s | Max: 14m 07s | Hits:  94%/1174  
      🔍 rtx2080            Pass:  83%/24  | Total:  2h 13m | Avg:  5m 33s | Max: 13m 07s | Hits:  95%/11142 
    🔍 jobs: Build 🔍
      🔍 Build              Pass:  82%/23  | Total:  1h 52m | Avg:  4m 54s | Max: 11m 40s | Hits:  94%/10555 
      🟩 Test               Pass: 100%/3   | Total: 39m 08s | Avg: 13m 02s | Max: 14m 07s | Hits:  99%/1761  
    🔍 std: 20 🔍
      🟩 17                 Pass: 100%/4   | Total: 20m 29s | Avg:  5m 07s | Max:  8m 03s | Hits:  93%/2346  
      🔍 20                 Pass:  81%/22  | Total:  2h 11m | Avg:  5m 58s | Max: 14m 07s | Hits:  95%/9970  
    🟨 cxx
      🟩 Clang14            Pass: 100%/2   | Total:  7m 09s | Avg:  3m 34s | Max:  3m 48s | Hits:  98%/1178  
      🟩 Clang15            Pass: 100%/1   | Total:  3m 49s | Avg:  3m 49s | Max:  3m 49s | Hits:  98%/587   
      🟩 Clang16            Pass: 100%/1   | Total:  3m 49s | Avg:  3m 49s | Max:  3m 49s | Hits:  98%/587   
      🟩 Clang17            Pass: 100%/1   | Total:  3m 48s | Avg:  3m 48s | Max:  3m 48s | Hits:  98%/587   
      🟩 Clang18            Pass: 100%/1   | Total:  3m 52s | Avg:  3m 52s | Max:  3m 52s | Hits:  98%/587   
      🟩 Clang19            Pass: 100%/4   | Total: 22m 29s | Avg:  5m 37s | Max: 11m 54s | Hits:  98%/2348  
      🟥 GCC10              Pass:   0%/2   | Total:  5m 35s | Avg:  2m 47s | Max:  2m 55s
      🟥 GCC11              Pass:   0%/1   | Total:  2m 54s | Avg:  2m 54s | Max:  2m 54s
      🟥 GCC12              Pass:   0%/1   | Total:  2m 55s | Avg:  2m 55s | Max:  2m 55s
      🟩 GCC13              Pass: 100%/8   | Total: 56m 22s | Avg:  7m 02s | Max: 14m 07s | Hits:  91%/4696  
      🟩 MSVC14.39          Pass: 100%/1   | Total: 11m 40s | Avg: 11m 40s | Max: 11m 40s | Hits:  92%/288   
      🟩 MSVC14.42          Pass: 100%/1   | Total: 11m 12s | Avg: 11m 12s | Max: 11m 12s | Hits:  91%/288   
      🟩 NVHPC25.3          Pass: 100%/2   | Total: 16m 23s | Avg:  8m 11s | Max:  8m 20s | Hits:  96%/1170  
    🟨 cudacxx_family
      🟨 nvcc               Pass:  84%/26  | Total:  2h 31m | Avg:  5m 50s | Max: 14m 07s | Hits:  95%/12316 
    🟨 ctk
      🟨 12.0               Pass:  66%/3   | Total: 17m 41s | Avg:  5m 53s | Max: 11m 40s | Hits:  96%/877   
      🟨 12.8               Pass:  86%/23  | Total:  2h 14m | Avg:  5m 50s | Max: 14m 07s | Hits:  95%/11439 
    🟨 cudacxx
      🟨 nvcc12.0           Pass:  66%/3   | Total: 17m 41s | Avg:  5m 53s | Max: 11m 40s | Hits:  96%/877   
      🟨 nvcc12.8           Pass:  86%/23  | Total:  2h 14m | Avg:  5m 50s | Max: 14m 07s | Hits:  95%/11439 
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total: 22m 50s | Avg:  7m 36s | Max: 14m 07s | Hits:  92%/1761  
      🟩 90a                Pass: 100%/1   | Total:  4m 25s | Avg:  4m 25s | Max:  4m 25s | Hits:  88%/587   
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
+/- CUDA Experimental
stdpar
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
+/- CUDA Experimental
stdpar
python
CCCL C Parallel Library
Catch2Helper

🏃‍ Runner counts (total jobs: 26)

# Runner
17 linux-amd64-cpu16
4 linux-arm64-cpu16
2 windows-amd64-cpu16
2 linux-amd64-gpu-rtx2080-latest-1
1 linux-amd64-gpu-h100-latest-1

github-actions[bot] avatar Apr 23 '25 02:04 github-actions[bot]

/ok to test 361ffc6

ericniebler avatar Apr 23 '25 02:04 ericniebler

🟩 CI finished in 20m 24s: Pass: 100%/26 | Total: 2h 25m | Avg: 5m 34s | Max: 14m 05s | Hits: 97%/14668
  • 🟩 cudax: Pass: 100%/26 | Total: 2h 25m | Avg: 5m 34s | Max: 14m 05s | Hits: 97%/14668

    🟩 cpu
      🟩 amd64              Pass: 100%/22  | Total:  2h 11m | Avg:  5m 58s | Max: 14m 05s | Hits:  97%/12320 
      🟩 arm64              Pass: 100%/4   | Total: 13m 40s | Avg:  3m 25s | Max:  3m 34s | Hits:  98%/2348  
    🟩 ctk
      🟩 12.0               Pass: 100%/3   | Total: 17m 49s | Avg:  5m 56s | Max: 11m 14s | Hits:  96%/1466  
      🟩 12.8               Pass: 100%/23  | Total:  2h 07m | Avg:  5m 31s | Max: 14m 05s | Hits:  98%/13202 
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/3   | Total: 17m 49s | Avg:  5m 56s | Max: 11m 14s | Hits:  96%/1466  
      🟩 nvcc12.8           Pass: 100%/23  | Total:  2h 07m | Avg:  5m 31s | Max: 14m 05s | Hits:  98%/13202 
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/26  | Total:  2h 25m | Avg:  5m 34s | Max: 14m 05s | Hits:  97%/14668 
    🟩 cxx
      🟩 Clang14            Pass: 100%/2   | Total:  6m 52s | Avg:  3m 26s | Max:  3m 32s | Hits:  98%/1178  
      🟩 Clang15            Pass: 100%/1   | Total:  3m 50s | Avg:  3m 50s | Max:  3m 50s | Hits:  98%/587   
      🟩 Clang16            Pass: 100%/1   | Total:  4m 01s | Avg:  4m 01s | Max:  4m 01s | Hits:  98%/587   
      🟩 Clang17            Pass: 100%/1   | Total:  3m 44s | Avg:  3m 44s | Max:  3m 44s | Hits:  98%/587   
      🟩 Clang18            Pass: 100%/1   | Total:  3m 44s | Avg:  3m 44s | Max:  3m 44s | Hits:  98%/587   
      🟩 Clang19            Pass: 100%/4   | Total: 22m 39s | Avg:  5m 39s | Max: 11m 56s | Hits:  98%/2348  
      🟩 GCC10              Pass: 100%/2   | Total:  6m 47s | Avg:  3m 23s | Max:  3m 32s | Hits:  97%/1178  
      🟩 GCC11              Pass: 100%/1   | Total:  3m 44s | Avg:  3m 44s | Max:  3m 44s | Hits:  97%/587   
      🟩 GCC12              Pass: 100%/1   | Total:  3m 49s | Avg:  3m 49s | Max:  3m 49s | Hits:  97%/587   
      🟩 GCC13              Pass: 100%/8   | Total: 47m 39s | Avg:  5m 57s | Max: 14m 05s | Hits:  98%/4696  
      🟩 MSVC14.39          Pass: 100%/1   | Total: 11m 14s | Avg: 11m 14s | Max: 11m 14s | Hits:  91%/288   
      🟩 MSVC14.42          Pass: 100%/1   | Total: 10m 33s | Avg: 10m 33s | Max: 10m 33s | Hits:  91%/288   
      🟩 NVHPC25.3          Pass: 100%/2   | Total: 16m 24s | Avg:  8m 12s | Max:  8m 15s | Hits:  95%/1170  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/10  | Total: 44m 50s | Avg:  4m 29s | Max: 11m 56s | Hits:  98%/5874  
      🟩 GCC                Pass: 100%/12  | Total:  1h 01m | Avg:  5m 09s | Max: 14m 05s | Hits:  98%/7048  
      🟩 MSVC               Pass: 100%/2   | Total: 21m 47s | Avg: 10m 53s | Max: 11m 14s | Hits:  91%/576   
      🟩 NVHPC              Pass: 100%/2   | Total: 16m 24s | Avg:  8m 12s | Max:  8m 15s | Hits:  95%/1170  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 15m 02s | Avg:  7m 31s | Max: 11m 41s | Hits:  98%/1174  
      🟩 rtx2080            Pass: 100%/24  | Total:  2h 09m | Avg:  5m 24s | Max: 14m 05s | Hits:  97%/13494 
    🟩 jobs
      🟩 Build              Pass: 100%/23  | Total:  1h 47m | Avg:  4m 39s | Max: 11m 14s | Hits:  97%/12907 
      🟩 Test               Pass: 100%/3   | Total: 37m 42s | Avg: 12m 34s | Max: 14m 05s | Hits:  99%/1761  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total: 18m 37s | Avg:  6m 12s | Max: 11m 41s | Hits:  98%/1761  
      🟩 90a                Pass: 100%/1   | Total:  3m 28s | Avg:  3m 28s | Max:  3m 28s | Hits:  97%/587   
    🟩 std
      🟩 17                 Pass: 100%/4   | Total: 18m 36s | Avg:  4m 39s | Max:  8m 15s | Hits:  97%/2346  
      🟩 20                 Pass: 100%/22  | Total:  2h 06m | Avg:  5m 44s | Max: 14m 05s | Hits:  97%/12322 
    

github-actions[bot] avatar Apr 23 '25 03:04 github-actions[bot]

🟨 CI finished in 1h 49m: Pass: 96%/26 | Total: 2h 26m | Avg: 5m 37s | Max: 14m 11s | Hits: 97%/14380
  • 🟨 cudax: Pass: 96%/26 | Total: 2h 26m | Avg: 5m 37s | Max: 14m 11s | Hits: 97%/14380

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  95%/22  | Total:  2h 12m | Avg:  6m 02s | Max: 14m 11s | Hits:  97%/12032 
      🟩 arm64              Pass: 100%/4   | Total: 13m 21s | Avg:  3m 20s | Max:  3m 34s | Hits:  97%/2348  
    🔍 ctk: 12.8 🔍
      🟩 12.0               Pass: 100%/3   | Total: 18m 10s | Avg:  6m 03s | Max: 11m 13s | Hits:  96%/1466  
      🔍 12.8               Pass:  95%/23  | Total:  2h 07m | Avg:  5m 33s | Max: 14m 11s | Hits:  97%/12914 
    🔍 cudacxx: nvcc12.8 🔍
      🟩 nvcc12.0           Pass: 100%/3   | Total: 18m 10s | Avg:  6m 03s | Max: 11m 13s | Hits:  96%/1466  
      🔍 nvcc12.8           Pass:  95%/23  | Total:  2h 07m | Avg:  5m 33s | Max: 14m 11s | Hits:  97%/12914 
    🚨 cxx: MSVC14.42 🚨
      🟩 Clang14            Pass: 100%/2   | Total:  7m 13s | Avg:  3m 36s | Max:  3m 41s | Hits:  98%/1178  
      🟩 Clang15            Pass: 100%/1   | Total:  3m 44s | Avg:  3m 44s | Max:  3m 44s | Hits:  98%/587   
      🟩 Clang16            Pass: 100%/1   | Total:  3m 42s | Avg:  3m 42s | Max:  3m 42s | Hits:  98%/587   
      🟩 Clang17            Pass: 100%/1   | Total:  4m 00s | Avg:  4m 00s | Max:  4m 00s | Hits:  98%/587   
      🟩 Clang18            Pass: 100%/1   | Total:  3m 57s | Avg:  3m 57s | Max:  3m 57s | Hits:  98%/587   
      🟩 Clang19            Pass: 100%/4   | Total: 22m 08s | Avg:  5m 32s | Max: 12m 00s | Hits:  98%/2348  
      🟩 GCC10              Pass: 100%/2   | Total:  7m 00s | Avg:  3m 30s | Max:  3m 35s | Hits:  97%/1178  
      🟩 GCC11              Pass: 100%/1   | Total:  3m 45s | Avg:  3m 45s | Max:  3m 45s | Hits:  97%/587   
      🟩 GCC12              Pass: 100%/1   | Total:  3m 55s | Avg:  3m 55s | Max:  3m 55s | Hits:  97%/587   
      🟩 GCC13              Pass: 100%/8   | Total: 48m 40s | Avg:  6m 05s | Max: 14m 11s | Hits:  98%/4696  
      🟩 MSVC14.39          Pass: 100%/1   | Total: 11m 13s | Avg: 11m 13s | Max: 11m 13s | Hits:  90%/288   
      🔥 MSVC14.42          Pass:   0%/1   | Total: 11m 04s | Avg: 11m 04s | Max: 11m 04s
      🟩 NVHPC25.3          Pass: 100%/2   | Total: 15m 44s | Avg:  7m 52s | Max:  8m 00s | Hits:  95%/1170  
    🔍 cxx_family: MSVC 🔍
      🟩 Clang              Pass: 100%/10  | Total: 44m 44s | Avg:  4m 28s | Max: 12m 00s | Hits:  98%/5874  
      🟩 GCC                Pass: 100%/12  | Total:  1h 03m | Avg:  5m 16s | Max: 14m 11s | Hits:  98%/7048  
      🔍 MSVC               Pass:  50%/2   | Total: 22m 17s | Avg: 11m 08s | Max: 11m 13s | Hits:  90%/288   
      🟩 NVHPC              Pass: 100%/2   | Total: 15m 44s | Avg:  7m 52s | Max:  8m 00s | Hits:  95%/1170  
    🔍 gpu: rtx2080 🔍
      🟩 h100               Pass: 100%/2   | Total: 17m 30s | Avg:  8m 45s | Max: 14m 11s | Hits:  98%/1174  
      🔍 rtx2080            Pass:  95%/24  | Total:  2h 08m | Avg:  5m 21s | Max: 13m 07s | Hits:  97%/13206 
    🔍 jobs: Build 🔍
      🔍 Build              Pass:  95%/23  | Total:  1h 46m | Avg:  4m 38s | Max: 11m 13s | Hits:  97%/12619 
      🟩 Test               Pass: 100%/3   | Total: 39m 18s | Avg: 13m 06s | Max: 14m 11s | Hits:  99%/1761  
    🔍 std: 20 🔍
      🟩 17                 Pass: 100%/4   | Total: 18m 07s | Avg:  4m 31s | Max:  8m 00s | Hits:  97%/2346  
      🔍 20                 Pass:  95%/22  | Total:  2h 07m | Avg:  5m 49s | Max: 14m 11s | Hits:  97%/12034 
    🟨 cudacxx_family
      🟨 nvcc               Pass:  96%/26  | Total:  2h 26m | Avg:  5m 37s | Max: 14m 11s | Hits:  97%/14380 
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total: 21m 01s | Avg:  7m 00s | Max: 14m 11s | Hits:  98%/1761  
      🟩 90a                Pass: 100%/1   | Total:  3m 26s | Avg:  3m 26s | Max:  3m 26s | Hits:  97%/587   
    

github-actions[bot] avatar Apr 23 '25 16:04 github-actions[bot]

🟩 CI finished in 2h 54m: Pass: 100%/26 | Total: 2h 34m | Avg: 5m 56s | Max: 20m 23s | Hits: 97%/14668
  • 🟩 cudax: Pass: 100%/26 | Total: 2h 34m | Avg: 5m 56s | Max: 20m 23s | Hits: 97%/14668

    🟩 cpu
      🟩 amd64              Pass: 100%/22  | Total:  2h 20m | Avg:  6m 23s | Max: 20m 23s | Hits:  97%/12320 
      🟩 arm64              Pass: 100%/4   | Total: 13m 35s | Avg:  3m 23s | Max:  3m 37s | Hits:  97%/2348  
    🟩 ctk
      🟩 12.0               Pass: 100%/3   | Total: 19m 12s | Avg:  6m 24s | Max: 12m 08s | Hits:  96%/1466  
      🟩 12.8               Pass: 100%/23  | Total:  2h 15m | Avg:  5m 52s | Max: 20m 23s | Hits:  97%/13202 
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/3   | Total: 19m 12s | Avg:  6m 24s | Max: 12m 08s | Hits:  96%/1466  
      🟩 nvcc12.8           Pass: 100%/23  | Total:  2h 15m | Avg:  5m 52s | Max: 20m 23s | Hits:  97%/13202 
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/26  | Total:  2h 34m | Avg:  5m 56s | Max: 20m 23s | Hits:  97%/14668 
    🟩 cxx
      🟩 Clang14            Pass: 100%/2   | Total:  7m 23s | Avg:  3m 41s | Max:  3m 56s | Hits:  98%/1178  
      🟩 Clang15            Pass: 100%/1   | Total:  3m 48s | Avg:  3m 48s | Max:  3m 48s | Hits:  98%/587   
      🟩 Clang16            Pass: 100%/1   | Total:  3m 48s | Avg:  3m 48s | Max:  3m 48s | Hits:  98%/587   
      🟩 Clang17            Pass: 100%/1   | Total:  3m 47s | Avg:  3m 47s | Max:  3m 47s | Hits:  98%/587   
      🟩 Clang18            Pass: 100%/1   | Total:  3m 47s | Avg:  3m 47s | Max:  3m 47s | Hits:  98%/587   
      🟩 Clang19            Pass: 100%/4   | Total: 24m 20s | Avg:  6m 05s | Max: 13m 42s | Hits:  98%/2348  
      🟩 GCC10              Pass: 100%/2   | Total:  7m 10s | Avg:  3m 35s | Max:  3m 37s | Hits:  97%/1178  
      🟩 GCC11              Pass: 100%/1   | Total:  3m 52s | Avg:  3m 52s | Max:  3m 52s | Hits:  97%/587   
      🟩 GCC12              Pass: 100%/1   | Total:  3m 48s | Avg:  3m 48s | Max:  3m 48s | Hits:  97%/587   
      🟩 GCC13              Pass: 100%/8   | Total: 52m 48s | Avg:  6m 36s | Max: 20m 23s | Hits:  98%/4696  
      🟩 MSVC14.39          Pass: 100%/1   | Total: 12m 08s | Avg: 12m 08s | Max: 12m 08s | Hits:  90%/288   
      🟩 MSVC14.42          Pass: 100%/1   | Total: 11m 48s | Avg: 11m 48s | Max: 11m 48s | Hits:  90%/288   
      🟩 NVHPC25.3          Pass: 100%/2   | Total: 15m 53s | Avg:  7m 56s | Max:  7m 59s | Hits:  95%/1170  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/10  | Total: 46m 53s | Avg:  4m 41s | Max: 13m 42s | Hits:  98%/5874  
      🟩 GCC                Pass: 100%/12  | Total:  1h 07m | Avg:  5m 38s | Max: 20m 23s | Hits:  98%/7048  
      🟩 MSVC               Pass: 100%/2   | Total: 23m 56s | Avg: 11m 58s | Max: 12m 08s | Hits:  90%/576   
      🟩 NVHPC              Pass: 100%/2   | Total: 15m 53s | Avg:  7m 56s | Max:  7m 59s | Hits:  95%/1170  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 15m 00s | Avg:  7m 30s | Max: 11m 37s | Hits:  98%/1174  
      🟩 rtx2080            Pass: 100%/24  | Total:  2h 19m | Avg:  5m 48s | Max: 20m 23s | Hits:  97%/13494 
    🟩 jobs
      🟩 Build              Pass: 100%/23  | Total:  1h 48m | Avg:  4m 43s | Max: 12m 08s | Hits:  97%/12907 
      🟩 Test               Pass: 100%/3   | Total: 45m 42s | Avg: 15m 14s | Max: 20m 23s | Hits:  99%/1761  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total: 18m 19s | Avg:  6m 06s | Max: 11m 37s | Hits:  98%/1761  
      🟩 90a                Pass: 100%/1   | Total:  3m 14s | Avg:  3m 14s | Max:  3m 14s | Hits:  97%/587   
    🟩 std
      🟩 17                 Pass: 100%/4   | Total: 17m 58s | Avg:  4m 29s | Max:  7m 59s | Hits:  97%/2346  
      🟩 20                 Pass: 100%/22  | Total:  2h 16m | Avg:  6m 11s | Max: 20m 23s | Hits:  97%/12322 
    

github-actions[bot] avatar Apr 23 '25 21:04 github-actions[bot]

🟩 CI finished in 2h 06m: Pass: 100%/26 | Total: 2h 31m | Avg: 5m 48s | Max: 19m 48s | Hits: 97%/14668
  • 🟩 cudax: Pass: 100%/26 | Total: 2h 31m | Avg: 5m 48s | Max: 19m 48s | Hits: 97%/14668

    🟩 cpu
      🟩 amd64              Pass: 100%/22  | Total:  2h 17m | Avg:  6m 15s | Max: 19m 48s | Hits:  97%/12320 
      🟩 arm64              Pass: 100%/4   | Total: 13m 29s | Avg:  3m 22s | Max:  3m 32s | Hits:  97%/2348  
    🟩 ctk
      🟩 12.0               Pass: 100%/3   | Total: 18m 45s | Avg:  6m 15s | Max: 11m 41s | Hits:  96%/1466  
      🟩 12.8               Pass: 100%/23  | Total:  2h 12m | Avg:  5m 45s | Max: 19m 48s | Hits:  97%/13202 
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/3   | Total: 18m 45s | Avg:  6m 15s | Max: 11m 41s | Hits:  96%/1466  
      🟩 nvcc12.8           Pass: 100%/23  | Total:  2h 12m | Avg:  5m 45s | Max: 19m 48s | Hits:  97%/13202 
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/26  | Total:  2h 31m | Avg:  5m 48s | Max: 19m 48s | Hits:  97%/14668 
    🟩 cxx
      🟩 Clang14            Pass: 100%/2   | Total:  7m 09s | Avg:  3m 34s | Max:  3m 43s | Hits:  98%/1178  
      🟩 Clang15            Pass: 100%/1   | Total:  3m 44s | Avg:  3m 44s | Max:  3m 44s | Hits:  98%/587   
      🟩 Clang16            Pass: 100%/1   | Total:  3m 48s | Avg:  3m 48s | Max:  3m 48s | Hits:  98%/587   
      🟩 Clang17            Pass: 100%/1   | Total:  3m 53s | Avg:  3m 53s | Max:  3m 53s | Hits:  98%/587   
      🟩 Clang18            Pass: 100%/1   | Total:  3m 49s | Avg:  3m 49s | Max:  3m 49s | Hits:  98%/587   
      🟩 Clang19            Pass: 100%/4   | Total: 22m 35s | Avg:  5m 38s | Max: 11m 59s | Hits:  98%/2348  
      🟩 GCC10              Pass: 100%/2   | Total:  7m 11s | Avg:  3m 35s | Max:  3m 38s | Hits:  97%/1178  
      🟩 GCC11              Pass: 100%/1   | Total:  3m 43s | Avg:  3m 43s | Max:  3m 43s | Hits:  97%/587   
      🟩 GCC12              Pass: 100%/1   | Total:  3m 53s | Avg:  3m 53s | Max:  3m 53s | Hits:  97%/587   
      🟩 GCC13              Pass: 100%/8   | Total: 52m 48s | Avg:  6m 36s | Max: 19m 48s | Hits:  98%/4696  
      🟩 MSVC14.39          Pass: 100%/1   | Total: 11m 41s | Avg: 11m 41s | Max: 11m 41s | Hits:  90%/288   
      🟩 MSVC14.42          Pass: 100%/1   | Total: 10m 52s | Avg: 10m 52s | Max: 10m 52s | Hits:  90%/288   
      🟩 NVHPC25.3          Pass: 100%/2   | Total: 16m 03s | Avg:  8m 01s | Max:  8m 15s | Hits:  95%/1170  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/10  | Total: 44m 58s | Avg:  4m 29s | Max: 11m 59s | Hits:  98%/5874  
      🟩 GCC                Pass: 100%/12  | Total:  1h 07m | Avg:  5m 37s | Max: 19m 48s | Hits:  98%/7048  
      🟩 MSVC               Pass: 100%/2   | Total: 22m 33s | Avg: 11m 16s | Max: 11m 41s | Hits:  90%/576   
      🟩 NVHPC              Pass: 100%/2   | Total: 16m 03s | Avg:  8m 01s | Max:  8m 15s | Hits:  95%/1170  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 15m 18s | Avg:  7m 39s | Max: 11m 51s | Hits:  98%/1174  
      🟩 rtx2080            Pass: 100%/24  | Total:  2h 15m | Avg:  5m 39s | Max: 19m 48s | Hits:  97%/13494 
    🟩 jobs
      🟩 Build              Pass: 100%/23  | Total:  1h 47m | Avg:  4m 40s | Max: 11m 41s | Hits:  97%/12907 
      🟩 Test               Pass: 100%/3   | Total: 43m 38s | Avg: 14m 32s | Max: 19m 48s | Hits:  99%/1761  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total: 18m 37s | Avg:  6m 12s | Max: 11m 51s | Hits:  98%/1761  
      🟩 90a                Pass: 100%/1   | Total:  3m 17s | Avg:  3m 17s | Max:  3m 17s | Hits:  97%/587   
    🟩 std
      🟩 17                 Pass: 100%/4   | Total: 17m 49s | Avg:  4m 27s | Max:  7m 48s | Hits:  97%/2346  
      🟩 20                 Pass: 100%/22  | Total:  2h 13m | Avg:  6m 03s | Max: 19m 48s | Hits:  97%/12322 
    

github-actions[bot] avatar Apr 25 '25 01:04 github-actions[bot]

🟩 CI finished in 44m 03s: Pass: 100%/26 | Total: 2h 29m | Avg: 5m 45s | Max: 14m 43s | Hits: 98%/14668
  • 🟩 cudax: Pass: 100%/26 | Total: 2h 29m | Avg: 5m 45s | Max: 14m 43s | Hits: 98%/14668

    🟩 cpu
      🟩 amd64              Pass: 100%/22  | Total:  2h 16m | Avg:  6m 11s | Max: 14m 43s | Hits:  98%/12320 
      🟩 arm64              Pass: 100%/4   | Total: 13m 32s | Avg:  3m 23s | Max:  3m 27s | Hits:  98%/2348  
    🟩 ctk
      🟩 12.0               Pass: 100%/3   | Total: 17m 37s | Avg:  5m 52s | Max: 10m 51s | Hits:  97%/1466  
      🟩 12.8               Pass: 100%/23  | Total:  2h 12m | Avg:  5m 44s | Max: 14m 43s | Hits:  98%/13202 
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/3   | Total: 17m 37s | Avg:  5m 52s | Max: 10m 51s | Hits:  97%/1466  
      🟩 nvcc12.8           Pass: 100%/23  | Total:  2h 12m | Avg:  5m 44s | Max: 14m 43s | Hits:  98%/13202 
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/26  | Total:  2h 29m | Avg:  5m 45s | Max: 14m 43s | Hits:  98%/14668 
    🟩 cxx
      🟩 Clang14            Pass: 100%/2   | Total:  7m 10s | Avg:  3m 35s | Max:  3m 48s | Hits:  99%/1178  
      🟩 Clang15            Pass: 100%/1   | Total:  3m 40s | Avg:  3m 40s | Max:  3m 40s | Hits:  98%/587   
      🟩 Clang16            Pass: 100%/1   | Total:  3m 50s | Avg:  3m 50s | Max:  3m 50s | Hits:  99%/587   
      🟩 Clang17            Pass: 100%/1   | Total:  3m 51s | Avg:  3m 51s | Max:  3m 51s | Hits:  98%/587   
      🟩 Clang18            Pass: 100%/1   | Total:  4m 03s | Avg:  4m 03s | Max:  4m 03s | Hits:  98%/587   
      🟩 Clang19            Pass: 100%/4   | Total: 23m 50s | Avg:  5m 57s | Max: 13m 28s | Hits:  99%/2348  
      🟩 GCC10              Pass: 100%/2   | Total:  7m 25s | Avg:  3m 42s | Max:  4m 01s | Hits:  98%/1178  
      🟩 GCC11              Pass: 100%/1   | Total:  3m 51s | Avg:  3m 51s | Max:  3m 51s | Hits:  98%/587   
      🟩 GCC12              Pass: 100%/1   | Total:  3m 58s | Avg:  3m 58s | Max:  3m 58s | Hits:  98%/587   
      🟩 GCC13              Pass: 100%/8   | Total: 49m 27s | Avg:  6m 10s | Max: 14m 43s | Hits:  98%/4696  
      🟩 MSVC14.39          Pass: 100%/1   | Total: 10m 51s | Avg: 10m 51s | Max: 10m 51s | Hits:  93%/288   
      🟩 MSVC14.42          Pass: 100%/1   | Total: 11m 02s | Avg: 11m 02s | Max: 11m 02s | Hits:  93%/288   
      🟩 NVHPC25.3          Pass: 100%/2   | Total: 16m 40s | Avg:  8m 20s | Max:  8m 43s | Hits:  96%/1170  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/10  | Total: 46m 24s | Avg:  4m 38s | Max: 13m 28s | Hits:  99%/5874  
      🟩 GCC                Pass: 100%/12  | Total:  1h 04m | Avg:  5m 23s | Max: 14m 43s | Hits:  98%/7048  
      🟩 MSVC               Pass: 100%/2   | Total: 21m 53s | Avg: 10m 56s | Max: 11m 02s | Hits:  93%/576   
      🟩 NVHPC              Pass: 100%/2   | Total: 16m 40s | Avg:  8m 20s | Max:  8m 43s | Hits:  96%/1170  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 18m 03s | Avg:  9m 01s | Max: 14m 43s | Hits:  99%/1174  
      🟩 rtx2080            Pass: 100%/24  | Total:  2h 11m | Avg:  5m 28s | Max: 13m 29s | Hits:  98%/13494 
    🟩 jobs
      🟩 Build              Pass: 100%/23  | Total:  1h 47m | Avg:  4m 41s | Max: 11m 02s | Hits:  98%/12907 
      🟩 Test               Pass: 100%/3   | Total: 41m 40s | Avg: 13m 53s | Max: 14m 43s | Hits:  99%/1761  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total: 21m 22s | Avg:  7m 07s | Max: 14m 43s | Hits:  98%/1761  
      🟩 90a                Pass: 100%/1   | Total:  3m 27s | Avg:  3m 27s | Max:  3m 27s | Hits:  98%/587   
    🟩 std
      🟩 17                 Pass: 100%/4   | Total: 18m 51s | Avg:  4m 42s | Max:  8m 43s | Hits:  98%/2346  
      🟩 20                 Pass: 100%/22  | Total:  2h 10m | Avg:  5m 56s | Max: 14m 43s | Hits:  98%/12322 
    

github-actions[bot] avatar Apr 25 '25 23:04 github-actions[bot]

🟩 CI finished in 22m 21s: Pass: 100%/26 | Total: 2h 06m | Avg: 4m 51s | Max: 11m 08s | Hits: 98%/14668
  • 🟩 cudax: Pass: 100%/26 | Total: 2h 06m | Avg: 4m 51s | Max: 11m 08s | Hits: 98%/14668

    🟩 cpu
      🟩 amd64              Pass: 100%/22  | Total:  1h 54m | Avg:  5m 11s | Max: 11m 08s | Hits:  98%/12320 
      🟩 arm64              Pass: 100%/4   | Total: 12m 10s | Avg:  3m 02s | Max:  3m 09s | Hits:  99%/2348  
    🟩 ctk
      🟩 12.0               Pass: 100%/3   | Total: 17m 31s | Avg:  5m 50s | Max: 11m 08s | Hits:  97%/1466  
      🟩 12.8               Pass: 100%/23  | Total:  1h 48m | Avg:  4m 44s | Max: 11m 00s | Hits:  99%/13202 
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/3   | Total: 17m 31s | Avg:  5m 50s | Max: 11m 08s | Hits:  97%/1466  
      🟩 nvcc12.8           Pass: 100%/23  | Total:  1h 48m | Avg:  4m 44s | Max: 11m 00s | Hits:  99%/13202 
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/26  | Total:  2h 06m | Avg:  4m 51s | Max: 11m 08s | Hits:  98%/14668 
    🟩 cxx
      🟩 Clang14            Pass: 100%/2   | Total:  6m 49s | Avg:  3m 24s | Max:  3m 41s | Hits: 100%/1178  
      🟩 Clang15            Pass: 100%/1   | Total:  3m 25s | Avg:  3m 25s | Max:  3m 25s | Hits: 100%/587   
      🟩 Clang16            Pass: 100%/1   | Total:  3m 42s | Avg:  3m 42s | Max:  3m 42s | Hits: 100%/587   
      🟩 Clang17            Pass: 100%/1   | Total:  3m 39s | Avg:  3m 39s | Max:  3m 39s | Hits: 100%/587   
      🟩 Clang18            Pass: 100%/1   | Total:  3m 25s | Avg:  3m 25s | Max:  3m 25s | Hits: 100%/587   
      🟩 Clang19            Pass: 100%/4   | Total: 17m 14s | Avg:  4m 18s | Max:  7m 55s | Hits: 100%/2348  
      🟩 GCC10              Pass: 100%/2   | Total:  6m 46s | Avg:  3m 23s | Max:  3m 31s | Hits:  99%/1178  
      🟩 GCC11              Pass: 100%/1   | Total:  3m 35s | Avg:  3m 35s | Max:  3m 35s | Hits:  99%/587   
      🟩 GCC12              Pass: 100%/1   | Total:  3m 28s | Avg:  3m 28s | Max:  3m 28s | Hits:  99%/587   
      🟩 GCC13              Pass: 100%/8   | Total: 35m 19s | Avg:  4m 24s | Max:  7m 57s | Hits:  99%/4696  
      🟩 MSVC14.39          Pass: 100%/1   | Total: 11m 08s | Avg: 11m 08s | Max: 11m 08s | Hits:  87%/288   
      🟩 MSVC14.42          Pass: 100%/1   | Total: 11m 00s | Avg: 11m 00s | Max: 11m 00s | Hits:  87%/288   
      🟩 NVHPC25.3          Pass: 100%/2   | Total: 16m 57s | Avg:  8m 28s | Max:  8m 43s | Hits:  93%/1170  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/10  | Total: 38m 14s | Avg:  3m 49s | Max:  7m 55s | Hits: 100%/5874  
      🟩 GCC                Pass: 100%/12  | Total: 49m 08s | Avg:  4m 05s | Max:  7m 57s | Hits:  99%/7048  
      🟩 MSVC               Pass: 100%/2   | Total: 22m 08s | Avg: 11m 04s | Max: 11m 08s | Hits:  87%/576   
      🟩 NVHPC              Pass: 100%/2   | Total: 16m 57s | Avg:  8m 28s | Max:  8m 43s | Hits:  93%/1170  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 10m 53s | Avg:  5m 26s | Max:  7m 37s | Hits:  99%/1174  
      🟩 rtx2080            Pass: 100%/24  | Total:  1h 55m | Avg:  4m 48s | Max: 11m 08s | Hits:  98%/13494 
    🟩 jobs
      🟩 Build              Pass: 100%/23  | Total:  1h 42m | Avg:  4m 28s | Max: 11m 08s | Hits:  98%/12907 
      🟩 Test               Pass: 100%/3   | Total: 23m 29s | Avg:  7m 49s | Max:  7m 57s | Hits:  99%/1761  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total: 14m 08s | Avg:  4m 42s | Max:  7m 37s | Hits:  99%/1761  
      🟩 90a                Pass: 100%/1   | Total:  3m 24s | Avg:  3m 24s | Max:  3m 24s | Hits:  99%/587   
    🟩 std
      🟩 17                 Pass: 100%/4   | Total: 17m 39s | Avg:  4m 24s | Max:  8m 14s | Hits:  98%/2346  
      🟩 20                 Pass: 100%/22  | Total:  1h 48m | Avg:  4m 56s | Max: 11m 08s | Hits:  98%/12322 
    

github-actions[bot] avatar Apr 28 '25 22:04 github-actions[bot]