[Thrust]: New "sum rows" and "sum columns" examples

Open brycelelbach opened this issue 8 months ago • 15 comments

Look at this code, it's so beautiful.

Note how the data is randomly initialized on the device, in parallel, by calling discard on the PRNG engine. Most of the other Thrust examples do this serially.

brycelelbach avatar Apr 15 '25 19:04 brycelelbach
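
A rough, host-only sketch of the discard pattern mentioned in the opening comment; it is not the code from this pull request. It assumes the standard library's std::minstd_rand in place of a Thrust engine, and the names element_at, fill_by_index, and fill_serial are made up for illustration. The idea it shows: each element seeds its own engine and skips ahead by its index, so every element depends only on (seed, index) and can be computed independently of the others.

```cpp
#include <cassert>  // used by the check in the usage note below
#include <cstddef>
#include <random>
#include <vector>

using value_type = std::minstd_rand::result_type;

// Hypothetical helper: value number i of the stream started by `seed`.
// A fresh engine is seeded and advanced past the first i draws with
// discard(), so the result depends only on (seed, i).
value_type element_at(value_type seed, std::size_t i) {
  std::minstd_rand rng(seed);
  rng.discard(i);
  return rng();
}

// Stand-in for a parallel transform over indices 0..n-1: the iterations
// share no state, so they could run in any order or concurrently.
std::vector<value_type> fill_by_index(value_type seed, std::size_t n) {
  std::vector<value_type> out(n);
  for (std::size_t i = 0; i < n; ++i) {
    out[i] = element_at(seed, i);
  }
  return out;
}

// Serial baseline: one engine, drawn from n times in order.
std::vector<value_type> fill_serial(value_type seed, std::size_t n) {
  std::vector<value_type> out(n);
  std::minstd_rand rng(seed);
  for (auto& x : out) {
    x = rng();
  }
  return out;
}
```

Under these assumptions, `assert(fill_by_index(42, 8) == fill_serial(42, 8));` should hold, since skipping i draws and then drawing once yields the same value the serial loop writes at index i. Thrust's random engines expose a discard member as well, so the same per-index shape can be used inside a device-side functor.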

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

copy-pr-bot[bot] avatar Apr 15 '25 19:04 copy-pr-bot[bot]

needs rebase, otherwise LGTM!

gonidelis avatar Apr 15 '25 22:04 gonidelis

/ok to test 42d9b7c

jrhemstad avatar Apr 16 '25 14:04 jrhemstad

🟨 CI finished in 2h 00m: Pass: 54%/103 | Total: 2d 04h | Avg: 30m 20s | Max: 1h 17m | Hits: 91%/56863
  • 🟥 thrust: Pass: 0%/47 | Total: 11h 46m | Avg: 15m 02s | Max: 42m 05s

    🟥 cmake_options
      🟥 -DTHRUST_DISPATCH_TYPE=Force32bit Pass:   0%/2   | Total: 13m 35s | Avg:  6m 47s | Max: 13m 35s
    🟥 cpu
      🟥 amd64              Pass:   0%/45  | Total: 11h 19m | Avg: 15m 06s | Max: 42m 05s
      🟥 arm64              Pass:   0%/2   | Total: 27m 09s | Avg: 13m 34s | Max: 14m 01s
    🟥 ctk
      🟥 12.0               Pass:   0%/5   | Total:  1h 29m | Avg: 17m 57s | Max: 27m 07s
      🟥 12.8               Pass:   0%/42  | Total: 10h 16m | Avg: 14m 41s | Max: 42m 05s
    🟥 cudacxx
      🟥 ClangCUDA19        Pass:   0%/2   | Total: 29m 34s | Avg: 14m 47s | Max: 15m 06s
      🟥 nvcc12.0           Pass:   0%/5   | Total:  1h 29m | Avg: 17m 57s | Max: 27m 07s
      🟥 nvcc12.8           Pass:   0%/40  | Total:  9h 47m | Avg: 14m 41s | Max: 42m 05s
    🟥 cudacxx_family
      🟥 ClangCUDA          Pass:   0%/2   | Total: 29m 34s | Avg: 14m 47s | Max: 15m 06s
      🟥 nvcc               Pass:   0%/45  | Total: 11h 17m | Avg: 15m 02s | Max: 42m 05s
    🟥 cxx
      🟥 Clang14            Pass:   0%/4   | Total: 57m 07s | Avg: 14m 16s | Max: 15m 04s
      🟥 Clang15            Pass:   0%/2   | Total: 33m 18s | Avg: 16m 39s | Max: 17m 20s
      🟥 Clang16            Pass:   0%/2   | Total: 27m 49s | Avg: 13m 54s | Max: 14m 06s
      🟥 Clang17            Pass:   0%/2   | Total: 28m 43s | Avg: 14m 21s | Max: 15m 05s
      🟥 Clang18            Pass:   0%/2   | Total: 27m 50s | Avg: 13m 55s | Max: 14m 37s
      🟥 Clang19            Pass:   0%/7   | Total:  1h 11m | Avg: 10m 15s | Max: 15m 16s
      🟥 GCC7               Pass:   0%/2   | Total: 30m 29s | Avg: 15m 14s | Max: 15m 57s
      🟥 GCC8               Pass:   0%/1   | Total: 17m 08s | Avg: 17m 08s | Max: 17m 08s
      🟥 GCC9               Pass:   0%/2   | Total: 37m 15s | Avg: 18m 37s | Max: 19m 24s
      🟥 GCC10              Pass:   0%/2   | Total: 30m 22s | Avg: 15m 11s | Max: 16m 27s
      🟥 GCC11              Pass:   0%/2   | Total: 29m 40s | Avg: 14m 50s | Max: 15m 23s
      🟥 GCC12              Pass:   0%/2   | Total: 33m 06s | Avg: 16m 33s | Max: 17m 32s
      🟥 GCC13              Pass:   0%/10  | Total:  1h 19m | Avg:  7m 59s | Max: 16m 23s
      🟥 MSVC14.29          Pass:   0%/2   | Total: 54m 16s | Avg: 27m 08s | Max: 27m 09s
      🟥 MSVC14.42          Pass:   0%/3   | Total:  1h 10m | Avg: 23m 23s | Max: 35m 56s
      🟥 NVHPC25.3          Pass:   0%/2   | Total:  1h 17m | Avg: 38m 56s | Max: 42m 05s
    🟥 cxx_family
      🟥 Clang              Pass:   0%/19  | Total:  4h 06m | Avg: 12m 58s | Max: 17m 20s
      🟥 GCC                Pass:   0%/21  | Total:  4h 17m | Avg: 12m 16s | Max: 19m 24s
      🟥 MSVC               Pass:   0%/5   | Total:  2h 04m | Avg: 24m 53s | Max: 35m 56s
      🟥 NVHPC              Pass:   0%/2   | Total:  1h 17m | Avg: 38m 56s | Max: 42m 05s
    🟥 gpu
      🟥 h100               Pass:   0%/2   | Total:  7m 44s | Avg:  3m 52s | Max:  7m 44s
      🟥 rtx2080            Pass:   0%/35  | Total: 10h 17m | Avg: 17m 39s | Max: 42m 05s
      🟥 rtx4090            Pass:   0%/10  | Total:  1h 21m | Avg:  8m 07s | Max: 35m 56s
    🟥 jobs
      🟥 Build              Pass:   0%/40  | Total: 11h 46m | Avg: 17m 40s | Max: 42m 05s
      🟥 TestCPU            Pass:   0%/3  
      🟥 TestGPU            Pass:   0%/4  
    🟥 sm
      🟥 90                 Pass:   0%/2   | Total:  7m 44s | Avg:  3m 52s | Max:  7m 44s
      🟥 90;90a;100         Pass:   0%/1   | Total: 13m 41s | Avg: 13m 41s | Max: 13m 41s
    🟥 std
      🟥 17                 Pass:   0%/21  | Total:  6h 36m | Avg: 18m 53s | Max: 42m 05s
      🟥 20                 Pass:   0%/24  | Total:  4h 56m | Avg: 12m 21s | Max: 35m 56s
    
  • 🟩 cub: Pass: 100%/47 | Total: 1d 15h | Avg: 49m 54s | Max: 1h 17m | Hits: 91%/56545

    🟩 cpu
      🟩 amd64              Pass: 100%/45  | Total:  1d 13h | Avg: 49m 39s | Max:  1h 17m | Hits:  91%/54087 
      🟩 arm64              Pass: 100%/2   | Total:  1h 50m | Avg: 55m 27s | Max: 56m 16s | Hits:  90%/2458  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  4h 51m | Avg: 58m 12s | Max:  1h 05m | Hits:  85%/5974  
      🟩 12.8               Pass: 100%/42  | Total:  1d 10h | Avg: 48m 54s | Max:  1h 17m | Hits:  92%/50571 
    🟩 cudacxx
      🟩 ClangCUDA19        Pass: 100%/2   | Total:  1h 48m | Avg: 54m 25s | Max: 57m 04s | Hits:  91%/2120  
      🟩 nvcc12.0           Pass: 100%/5   | Total:  4h 51m | Avg: 58m 12s | Max:  1h 05m | Hits:  85%/5974  
      🟩 nvcc12.8           Pass: 100%/40  | Total:  1d 08h | Avg: 48m 38s | Max:  1h 17m | Hits:  92%/48451 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 48m | Avg: 54m 25s | Max: 57m 04s | Hits:  91%/2120  
      🟩 nvcc               Pass: 100%/45  | Total:  1d 13h | Avg: 49m 42s | Max:  1h 17m | Hits:  91%/54425 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  3h 30m | Avg: 52m 34s | Max: 54m 35s | Hits:  90%/4924  
      🟩 Clang15            Pass: 100%/2   | Total:  1h 39m | Avg: 49m 46s | Max: 49m 54s | Hits:  90%/2458  
      🟩 Clang16            Pass: 100%/2   | Total:  1h 45m | Avg: 52m 39s | Max: 54m 58s | Hits:  90%/2458  
      🟩 Clang17            Pass: 100%/2   | Total:  1h 44m | Avg: 52m 11s | Max: 54m 14s | Hits:  90%/2458  
      🟩 Clang18            Pass: 100%/2   | Total:  1h 36m | Avg: 48m 07s | Max: 48m 29s | Hits:  90%/2458  
      🟩 Clang19            Pass: 100%/7   | Total:  5h 23m | Avg: 46m 10s | Max: 57m 04s | Hits:  93%/8265  
      🟩 GCC7               Pass: 100%/2   | Total:  1h 51m | Avg: 55m 56s | Max: 57m 21s | Hits:  90%/2462  
      🟩 GCC8               Pass: 100%/1   | Total: 59m 07s | Avg: 59m 07s | Max: 59m 07s | Hits:  90%/1231  
      🟩 GCC9               Pass: 100%/2   | Total:  1h 55m | Avg: 57m 49s | Max:  1h 05m | Hits:  78%/2462  
      🟩 GCC10              Pass: 100%/2   | Total:  1h 46m | Avg: 53m 18s | Max: 56m 54s | Hits:  90%/2462  
      🟩 GCC11              Pass: 100%/2   | Total:  1h 42m | Avg: 51m 06s | Max: 53m 13s | Hits:  90%/2458  
      🟩 GCC12              Pass: 100%/2   | Total:  1h 49m | Avg: 54m 52s | Max: 58m 09s | Hits:  90%/2458  
      🟩 GCC13              Pass: 100%/11  | Total:  6h 31m | Avg: 35m 34s | Max: 59m 55s | Hits:  95%/13519 
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 09m | Avg:  1h 04m | Max:  1h 04m | Hits:  91%/2100  
      🟩 MSVC14.42          Pass: 100%/2   | Total:  2h 27m | Avg:  1h 13m | Max:  1h 17m | Hits:  91%/2100  
      🟩 NVHPC25.3          Pass: 100%/2   | Total:  2h 12m | Avg:  1h 06m | Max:  1h 07m | Hits:  89%/2272  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/19  | Total: 15h 38m | Avg: 49m 25s | Max: 57m 04s | Hits:  91%/23021 
      🟩 GCC                Pass: 100%/22  | Total: 16h 36m | Avg: 45m 17s | Max:  1h 05m | Hits:  91%/27052 
      🟩 MSVC               Pass: 100%/4   | Total:  4h 37m | Avg:  1h 09m | Max:  1h 17m | Hits:  91%/4200  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 12m | Avg:  1h 06m | Max:  1h 07m | Hits:  89%/2272  
    🟩 gpu
      🟩 h100               Pass: 100%/3   | Total:  1h 09m | Avg: 23m 17s | Max: 27m 37s | Hits:  96%/3687  
      🟩 rtx2080            Pass: 100%/36  | Total:  1d 09h | Avg: 56m 13s | Max:  1h 17m | Hits:  89%/43026 
      🟩 rtxa6000           Pass: 100%/8   | Total:  4h 11m | Avg: 31m 27s | Max: 52m 40s | Hits:  97%/9832  
    🟩 jobs
      🟩 Build              Pass: 100%/39  | Total:  1d 11h | Avg: 55m 01s | Max:  1h 17m | Hits:  89%/46713 
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 28m 32s | Avg: 28m 32s | Max: 28m 32s | Hits:  99%/1229  
      🟩 GraphCapture       Pass: 100%/1   | Total: 21m 09s | Avg: 21m 09s | Max: 21m 09s | Hits:  99%/1229  
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 21m | Avg: 27m 04s | Max: 27m 45s | Hits:  99%/3687  
      🟩 TestGPU            Pass: 100%/3   | Total:  1h 08m | Avg: 22m 49s | Max: 24m 20s | Hits:  99%/3687  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total:  1h 09m | Avg: 23m 17s | Max: 27m 37s | Hits:  96%/3687  
      🟩 90;90a;100         Pass: 100%/1   | Total: 59m 55s | Avg: 59m 55s | Max: 59m 55s | Hits:  90%/1229  
    🟩 std
      🟩 17                 Pass: 100%/21  | Total: 19h 46m | Avg: 56m 29s | Max:  1h 10m | Hits:  89%/25026 
      🟩 20                 Pass: 100%/26  | Total: 19h 18m | Avg: 44m 34s | Max:  1h 17m | Hits:  93%/31519 
    
  • 🟩 stdpar: Pass: 100%/4 | Total: 19m 24s | Avg: 4m 51s | Max: 5m 58s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 11m 18s | Avg:  5m 39s | Max:  5m 58s
      🟩 arm64              Pass: 100%/2   | Total:  8m 06s | Avg:  4m 03s | Max:  4m 13s
    🟩 ctk
      🟩 12.8               Pass: 100%/4   | Total: 19m 24s | Avg:  4m 51s | Max:  5m 58s
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/4   | Total: 19m 24s | Avg:  4m 51s | Max:  5m 58s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/4   | Total: 19m 24s | Avg:  4m 51s | Max:  5m 58s
    🟩 cxx
      🟩 NVHPC25.3          Pass: 100%/4   | Total: 19m 24s | Avg:  4m 51s | Max:  5m 58s
    🟩 cxx_family
      🟩 NVHPC              Pass: 100%/4   | Total: 19m 24s | Avg:  4m 51s | Max:  5m 58s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/4   | Total: 19m 24s | Avg:  4m 51s | Max:  5m 58s
    🟩 jobs
      🟩 Build              Pass: 100%/4   | Total: 19m 24s | Avg:  4m 51s | Max:  5m 58s
    🟩 std
      🟩 17                 Pass: 100%/2   | Total:  9m 13s | Avg:  4m 36s | Max:  5m 20s
      🟩 20                 Pass: 100%/2   | Total: 10m 11s | Avg:  5m 05s | Max:  5m 58s
    
  • 🟩 python: Pass: 100%/3 | Total: 27m 56s | Avg: 9m 18s | Max: 19m 08s

    🟩 cpu
      🟩 amd64              Pass: 100%/3   | Total: 27m 56s | Avg:  9m 18s | Max: 19m 08s
    🟩 ctk
      🟩 12.8               Pass: 100%/3   | Total: 27m 56s | Avg:  9m 18s | Max: 19m 08s
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/3   | Total: 27m 56s | Avg:  9m 18s | Max: 19m 08s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/3   | Total: 27m 56s | Avg:  9m 18s | Max: 19m 08s
    🟩 cxx
      🟩 GCC13              Pass: 100%/3   | Total: 27m 56s | Avg:  9m 18s | Max: 19m 08s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/3   | Total: 27m 56s | Avg:  9m 18s | Max: 19m 08s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/3   | Total: 27m 56s | Avg:  9m 18s | Max: 19m 08s
    🟩 jobs
      🟩 cuda.cccl          Pass: 100%/1   | Total:  2m 51s | Avg:  2m 51s | Max:  2m 51s
      🟩 cuda.cooperative   Pass: 100%/1   | Total: 19m 08s | Avg: 19m 08s | Max: 19m 08s
      🟩 cuda.parallel      Pass: 100%/1   | Total:  5m 57s | Avg:  5m 57s | Max:  5m 57s
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 25m 09s | Avg: 12m 34s | Max: 22m 43s | Hits: 98%/318

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 25m 09s | Avg: 12m 34s | Max: 22m 43s | Hits:  98%/318   
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total: 25m 09s | Avg: 12m 34s | Max: 22m 43s | Hits:  98%/318   
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total: 25m 09s | Avg: 12m 34s | Max: 22m 43s | Hits:  98%/318   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 25m 09s | Avg: 12m 34s | Max: 22m 43s | Hits:  98%/318   
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 25m 09s | Avg: 12m 34s | Max: 22m 43s | Hits:  98%/318   
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 25m 09s | Avg: 12m 34s | Max: 22m 43s | Hits:  98%/318   
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total: 25m 09s | Avg: 12m 34s | Max: 22m 43s | Hits:  98%/318   
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 26s | Avg:  2m 26s | Max:  2m 26s | Hits:  98%/159   
      🟩 Test               Pass: 100%/1   | Total: 22m 43s | Avg: 22m 43s | Max: 22m 43s | Hits:  98%/159   
    

👃 Inspect Changes

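Modifications in project?

|     | Project                 |
|-----|-------------------------|
|     | CCCL Infrastructure     |
|     | libcu++                 |
|     | CUB                     |
| +/- | Thrust                  |
|     | CUDA Experimental       |
|     | stdpar                  |
|     | python                  |
|     | CCCL C Parallel Library |
|     | Catch2Helper            |

Modifications in project or dependencies?

|     | Project                 |
|-----|-------------------------|
|     | CCCL Infrastructure     |
|     | libcu++                 |
| +/- | CUB                     |
| +/- | Thrust                  |
|     | CUDA Experimental       |
| +/- | stdpar                  |
| +/- | python                  |
| +/- | CCCL C Parallel Library |
| +/- | Catch2Helper            |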

🏃‍ Runner counts (total jobs: 103)

| #  | Runner                            |
|----|-----------------------------------|
| 72 | linux-amd64-cpu16                 |
| 9  | windows-amd64-cpu16               |
| 6  | linux-arm64-cpu16                 |
| 6  | linux-amd64-gpu-rtxa6000-latest-1 |
| 4  | linux-amd64-gpu-rtx2080-latest-1  |
| 3  | linux-amd64-gpu-h100-latest-1     |
| 3  | linux-amd64-gpu-rtx4090-latest-1  |

github-actions[bot] avatar Apr 16 '25 16:04 github-actions[bot]

/ok to test

brycelelbach avatar Apr 16 '25 16:04 brycelelbach

/ok to test

@brycelelbach, there was an error processing your request: E1

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/1/

copy-pr-bot[bot] avatar Apr 16 '25 16:04 copy-pr-bot[bot]

/ok to test e3b8467

brycelelbach avatar Apr 16 '25 16:04 brycelelbach

/ok to test c2d3717

brycelelbach avatar Apr 16 '25 16:04 brycelelbach

/ok to test 92d906d

brycelelbach avatar Apr 16 '25 16:04 brycelelbach

/ok to test e5dd47f

brycelelbach avatar Apr 16 '25 17:04 brycelelbach

/ok to test e5dd47f

brycelelbach avatar Apr 16 '25 17:04 brycelelbach

/ok to test e5dd47f

@brycelelbach, there was an error processing your request: E2

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/2/

copy-pr-bot[bot] avatar Apr 16 '25 17:04 copy-pr-bot[bot]

/ok to test fc9dade

brycelelbach avatar Apr 16 '25 17:04 brycelelbach

/ok to test 0a4897b

brycelelbach avatar Apr 16 '25 17:04 brycelelbach

🟩 CI finished in 2h 13m: Pass: 100%/103 | Total: 17h 19m | Avg: 10m 05s | Max: 32m 29s | Hits: 98%/140467
  • 🟩 cub: Pass: 100%/47 | Total: 8h 42m | Avg: 11m 07s | Max: 30m 18s | Hits: 99%/56545

    🟩 cpu
      🟩 amd64              Pass: 100%/45  | Total:  8h 30m | Avg: 11m 21s | Max: 30m 18s | Hits:  99%/54087 
      🟩 arm64              Pass: 100%/2   | Total: 12m 03s | Avg:  6m 01s | Max:  6m 20s | Hits:  99%/2458  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total: 44m 27s | Avg:  8m 53s | Max: 19m 03s | Hits:  99%/5974  
      🟩 12.8               Pass: 100%/42  | Total:  7h 58m | Avg: 11m 23s | Max: 30m 18s | Hits:  99%/50571 
    🟩 cudacxx
      🟩 ClangCUDA19        Pass: 100%/2   | Total: 10m 06s | Avg:  5m 03s | Max:  5m 04s | Hits: 100%/2120  
      🟩 nvcc12.0           Pass: 100%/5   | Total: 44m 27s | Avg:  8m 53s | Max: 19m 03s | Hits:  99%/5974  
      🟩 nvcc12.8           Pass: 100%/40  | Total:  7h 48m | Avg: 11m 42s | Max: 30m 18s | Hits:  99%/48451 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 10m 06s | Avg:  5m 03s | Max:  5m 04s | Hits: 100%/2120  
      🟩 nvcc               Pass: 100%/45  | Total:  8h 32m | Avg: 11m 23s | Max: 30m 18s | Hits:  99%/54425 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 25m 04s | Avg:  6m 16s | Max:  7m 01s | Hits: 100%/4924  
      🟩 Clang15            Pass: 100%/2   | Total: 13m 12s | Avg:  6m 36s | Max:  6m 52s | Hits: 100%/2458  
      🟩 Clang16            Pass: 100%/2   | Total: 13m 22s | Avg:  6m 41s | Max:  6m 41s | Hits: 100%/2458  
      🟩 Clang17            Pass: 100%/2   | Total: 13m 42s | Avg:  6m 51s | Max:  7m 03s | Hits: 100%/2458  
      🟩 Clang18            Pass: 100%/2   | Total: 12m 50s | Avg:  6m 25s | Max:  6m 34s | Hits: 100%/2458  
      🟩 Clang19            Pass: 100%/7   | Total:  1h 21m | Avg: 11m 36s | Max: 28m 31s | Hits: 100%/8265  
      🟩 GCC7               Pass: 100%/2   | Total: 12m 49s | Avg:  6m 24s | Max:  6m 32s | Hits:  99%/2462  
      🟩 GCC8               Pass: 100%/1   | Total:  6m 45s | Avg:  6m 45s | Max:  6m 45s | Hits:  99%/1231  
      🟩 GCC9               Pass: 100%/2   | Total: 13m 16s | Avg:  6m 38s | Max:  6m 41s | Hits:  99%/2462  
      🟩 GCC10              Pass: 100%/2   | Total: 13m 44s | Avg:  6m 52s | Max:  7m 06s | Hits:  99%/2462  
      🟩 GCC11              Pass: 100%/2   | Total: 13m 47s | Avg:  6m 53s | Max:  6m 54s | Hits:  99%/2458  
      🟩 GCC12              Pass: 100%/2   | Total: 14m 04s | Avg:  7m 02s | Max:  7m 05s | Hits:  99%/2458  
      🟩 GCC13              Pass: 100%/11  | Total:  3h 06m | Avg: 16m 58s | Max: 30m 18s | Hits:  99%/13519 
      🟩 MSVC14.29          Pass: 100%/2   | Total: 38m 23s | Avg: 19m 11s | Max: 19m 20s | Hits:  99%/2100  
      🟩 MSVC14.42          Pass: 100%/2   | Total: 40m 05s | Avg: 20m 02s | Max: 21m 04s | Hits:  99%/2100  
      🟩 NVHPC25.3          Pass: 100%/2   | Total: 23m 57s | Avg: 11m 58s | Max: 12m 04s | Hits:  98%/2272  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/19  | Total:  2h 39m | Avg:  8m 23s | Max: 28m 31s | Hits: 100%/23021 
      🟩 GCC                Pass: 100%/22  | Total:  4h 21m | Avg: 11m 52s | Max: 30m 18s | Hits:  99%/27052 
      🟩 MSVC               Pass: 100%/4   | Total:  1h 18m | Avg: 19m 37s | Max: 21m 04s | Hits:  99%/4200  
      🟩 NVHPC              Pass: 100%/2   | Total: 23m 57s | Avg: 11m 58s | Max: 12m 04s | Hits:  98%/2272  
    🟩 gpu
      🟩 h100               Pass: 100%/3   | Total: 54m 09s | Avg: 18m 03s | Max: 26m 43s | Hits:  99%/3687  
      🟩 rtx2080            Pass: 100%/36  | Total:  4h 59m | Avg:  8m 19s | Max: 21m 04s | Hits:  99%/43026 
      🟩 rtxa6000           Pass: 100%/8   | Total:  2h 49m | Avg: 21m 08s | Max: 30m 18s | Hits:  99%/9832  
    🟩 jobs
      🟩 Build              Pass: 100%/39  | Total:  5h 18m | Avg:  8m 10s | Max: 21m 04s | Hits:  99%/46713 
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 26m 17s | Avg: 26m 17s | Max: 26m 17s | Hits:  99%/1229  
      🟩 GraphCapture       Pass: 100%/1   | Total: 23m 54s | Avg: 23m 54s | Max: 23m 54s | Hits:  99%/1229  
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 25m | Avg: 28m 30s | Max: 30m 18s | Hits:  99%/3687  
      🟩 TestGPU            Pass: 100%/3   | Total:  1h 08m | Avg: 22m 49s | Max: 23m 44s | Hits:  99%/3687  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total: 54m 09s | Avg: 18m 03s | Max: 26m 43s | Hits:  99%/3687  
      🟩 90;90a;100         Pass: 100%/1   | Total:  7m 43s | Avg:  7m 43s | Max:  7m 43s | Hits:  99%/1229  
    🟩 std
      🟩 17                 Pass: 100%/21  | Total:  3h 02m | Avg:  8m 41s | Max: 19m 20s | Hits:  99%/25026 
      🟩 20                 Pass: 100%/26  | Total:  5h 40m | Avg: 13m 05s | Max: 30m 18s | Hits:  99%/31519 
    
  • 🟩 thrust: Pass: 100%/47 | Total: 7h 26m | Avg: 9m 29s | Max: 32m 29s | Hits: 98%/83604

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 18m 32s | Avg:  9m 16s | Max: 11m 37s | Hits:  99%/3560  
    🟩 cpu
      🟩 amd64              Pass: 100%/45  | Total:  7h 15m | Avg:  9m 40s | Max: 32m 29s | Hits:  98%/80045 
      🟩 arm64              Pass: 100%/2   | Total: 10m 45s | Avg:  5m 22s | Max:  5m 38s | Hits:  99%/3559  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total: 40m 13s | Avg:  8m 02s | Max: 19m 31s | Hits:  99%/8891  
      🟩 12.8               Pass: 100%/42  | Total:  6h 46m | Avg:  9m 40s | Max: 32m 29s | Hits:  98%/74713 
    🟩 cudacxx
      🟩 ClangCUDA19        Pass: 100%/2   | Total: 11m 39s | Avg:  5m 49s | Max:  6m 10s | Hits:  99%/3558  
      🟩 nvcc12.0           Pass: 100%/5   | Total: 40m 13s | Avg:  8m 02s | Max: 19m 31s | Hits:  99%/8891  
      🟩 nvcc12.8           Pass: 100%/40  | Total:  6h 34m | Avg:  9m 51s | Max: 32m 29s | Hits:  98%/71155 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 11m 39s | Avg:  5m 49s | Max:  6m 10s | Hits:  99%/3558  
      🟩 nvcc               Pass: 100%/45  | Total:  7h 14m | Avg:  9m 39s | Max: 32m 29s | Hits:  98%/80046 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 21m 26s | Avg:  5m 21s | Max:  5m 59s | Hits:  99%/7116  
      🟩 Clang15            Pass: 100%/2   | Total: 12m 05s | Avg:  6m 02s | Max:  6m 08s | Hits:  99%/3558  
      🟩 Clang16            Pass: 100%/2   | Total: 11m 15s | Avg:  5m 37s | Max:  5m 38s | Hits:  99%/3558  
      🟩 Clang17            Pass: 100%/2   | Total: 11m 14s | Avg:  5m 37s | Max:  5m 40s | Hits:  99%/3558  
      🟩 Clang18            Pass: 100%/2   | Total: 11m 15s | Avg:  5m 37s | Max:  5m 40s | Hits:  99%/3558  
      🟩 Clang19            Pass: 100%/7   | Total: 46m 21s | Avg:  6m 37s | Max:  9m 55s | Hits:  99%/12453 
      🟩 GCC7               Pass: 100%/2   | Total: 12m 05s | Avg:  6m 02s | Max:  6m 56s | Hits:  99%/3560  
      🟩 GCC8               Pass: 100%/1   | Total:  5m 32s | Avg:  5m 32s | Max:  5m 32s | Hits:  99%/1780  
      🟩 GCC9               Pass: 100%/2   | Total: 11m 45s | Avg:  5m 52s | Max:  5m 53s | Hits:  99%/3560  
      🟩 GCC10              Pass: 100%/2   | Total: 11m 59s | Avg:  5m 59s | Max:  6m 15s | Hits:  99%/3560  
      🟩 GCC11              Pass: 100%/2   | Total: 12m 20s | Avg:  6m 10s | Max:  6m 11s | Hits:  99%/3560  
      🟩 GCC12              Pass: 100%/2   | Total: 13m 25s | Avg:  6m 42s | Max:  6m 43s | Hits:  99%/3560  
      🟩 GCC13              Pass: 100%/10  | Total:  1h 40m | Avg: 10m 03s | Max: 32m 29s | Hits:  94%/17800 
      🟩 MSVC14.29          Pass: 100%/2   | Total: 41m 03s | Avg: 20m 31s | Max: 21m 32s | Hits:  99%/3546  
      🟩 MSVC14.42          Pass: 100%/3   | Total:  1h 11m | Avg: 23m 58s | Max: 27m 30s | Hits:  99%/5319  
      🟩 NVHPC25.3          Pass: 100%/2   | Total: 52m 06s | Avg: 26m 03s | Max: 27m 08s | Hits:  98%/3558  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/19  | Total:  1h 53m | Avg:  5m 58s | Max:  9m 55s | Hits:  99%/33801 
      🟩 GCC                Pass: 100%/21  | Total:  2h 47m | Avg:  7m 59s | Max: 32m 29s | Hits:  97%/37380 
      🟩 MSVC               Pass: 100%/5   | Total:  1h 52m | Avg: 22m 35s | Max: 27m 30s | Hits:  99%/8865  
      🟩 NVHPC              Pass: 100%/2   | Total: 52m 06s | Avg: 26m 03s | Max: 27m 08s | Hits:  98%/3558  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 37m 28s | Avg: 18m 44s | Max: 32m 29s | Hits:  74%/3560  
      🟩 rtx2080            Pass: 100%/35  | Total:  4h 49m | Avg:  8m 16s | Max: 27m 08s | Hits:  99%/62261 
      🟩 rtx4090            Pass: 100%/10  | Total:  1h 59m | Avg: 11m 56s | Max: 27m 30s | Hits:  99%/17783 
    🟩 jobs
      🟩 Build              Pass: 100%/40  | Total:  5h 37m | Avg:  8m 26s | Max: 27m 08s | Hits:  99%/71153 
      🟩 TestCPU            Pass: 100%/3   | Total: 43m 44s | Avg: 14m 34s | Max: 27m 30s | Hits:  99%/5332  
      🟩 TestGPU            Pass: 100%/4   | Total:  1h 04m | Avg: 16m 11s | Max: 32m 29s | Hits:  87%/7119  
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 37m 28s | Avg: 18m 44s | Max: 32m 29s | Hits:  74%/3560  
      🟩 90;90a;100         Pass: 100%/1   | Total:  7m 10s | Avg:  7m 10s | Max:  7m 10s | Hits:  99%/1780  
    🟩 std
      🟩 17                 Pass: 100%/21  | Total:  3h 07m | Avg:  8m 56s | Max: 27m 08s | Hits:  99%/37350 
      🟩 20                 Pass: 100%/24  | Total:  4h 00m | Avg: 10m 00s | Max: 32m 29s | Hits:  97%/42694 
    
  • 🟩 stdpar: Pass: 100%/4 | Total: 18m 38s | Avg: 4m 39s | Max: 5m 29s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 10m 49s | Avg:  5m 24s | Max:  5m 29s
      🟩 arm64              Pass: 100%/2   | Total:  7m 49s | Avg:  3m 54s | Max:  3m 56s
    🟩 ctk
      🟩 12.8               Pass: 100%/4   | Total: 18m 38s | Avg:  4m 39s | Max:  5m 29s
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/4   | Total: 18m 38s | Avg:  4m 39s | Max:  5m 29s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/4   | Total: 18m 38s | Avg:  4m 39s | Max:  5m 29s
    🟩 cxx
      🟩 NVHPC25.3          Pass: 100%/4   | Total: 18m 38s | Avg:  4m 39s | Max:  5m 29s
    🟩 cxx_family
      🟩 NVHPC              Pass: 100%/4   | Total: 18m 38s | Avg:  4m 39s | Max:  5m 29s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/4   | Total: 18m 38s | Avg:  4m 39s | Max:  5m 29s
    🟩 jobs
      🟩 Build              Pass: 100%/4   | Total: 18m 38s | Avg:  4m 39s | Max:  5m 29s
    🟩 std
      🟩 17                 Pass: 100%/2   | Total:  9m 16s | Avg:  4m 38s | Max:  5m 20s
      🟩 20                 Pass: 100%/2   | Total:  9m 22s | Avg:  4m 41s | Max:  5m 29s
    
  • 🟩 python: Pass: 100%/3 | Total: 25m 54s | Avg: 8m 38s | Max: 17m 39s

    🟩 cpu
      🟩 amd64              Pass: 100%/3   | Total: 25m 54s | Avg:  8m 38s | Max: 17m 39s
    🟩 ctk
      🟩 12.8               Pass: 100%/3   | Total: 25m 54s | Avg:  8m 38s | Max: 17m 39s
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/3   | Total: 25m 54s | Avg:  8m 38s | Max: 17m 39s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/3   | Total: 25m 54s | Avg:  8m 38s | Max: 17m 39s
    🟩 cxx
      🟩 GCC13              Pass: 100%/3   | Total: 25m 54s | Avg:  8m 38s | Max: 17m 39s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/3   | Total: 25m 54s | Avg:  8m 38s | Max: 17m 39s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/3   | Total: 25m 54s | Avg:  8m 38s | Max: 17m 39s
    🟩 jobs
      🟩 cuda.cccl          Pass: 100%/1   | Total:  2m 49s | Avg:  2m 49s | Max:  2m 49s
      🟩 cuda.cooperative   Pass: 100%/1   | Total: 17m 39s | Avg: 17m 39s | Max: 17m 39s
      🟩 cuda.parallel      Pass: 100%/1   | Total:  5m 26s | Avg:  5m 26s | Max:  5m 26s
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 25m 36s | Avg: 12m 48s | Max: 23m 13s | Hits: 98%/318

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 25m 36s | Avg: 12m 48s | Max: 23m 13s | Hits:  98%/318   
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total: 25m 36s | Avg: 12m 48s | Max: 23m 13s | Hits:  98%/318   
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total: 25m 36s | Avg: 12m 48s | Max: 23m 13s | Hits:  98%/318   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 25m 36s | Avg: 12m 48s | Max: 23m 13s | Hits:  98%/318   
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 25m 36s | Avg: 12m 48s | Max: 23m 13s | Hits:  98%/318   
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 25m 36s | Avg: 12m 48s | Max: 23m 13s | Hits:  98%/318   
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total: 25m 36s | Avg: 12m 48s | Max: 23m 13s | Hits:  98%/318   
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 23s | Avg:  2m 23s | Max:  2m 23s | Hits:  98%/159   
      🟩 Test               Pass: 100%/1   | Total: 23m 13s | Avg: 23m 13s | Max: 23m 13s | Hits:  98%/159   
    

👃 Inspect Changes

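Modifications in project?

|     | Project                 |
|-----|-------------------------|
|     | CCCL Infrastructure     |
|     | libcu++                 |
|     | CUB                     |
| +/- | Thrust                  |
|     | CUDA Experimental       |
|     | stdpar                  |
|     | python                  |
|     | CCCL C Parallel Library |
|     | Catch2Helper            |

Modifications in project or dependencies?

|     | Project                 |
|-----|-------------------------|
|     | CCCL Infrastructure     |
|     | libcu++                 |
| +/- | CUB                     |
| +/- | Thrust                  |
|     | CUDA Experimental       |
| +/- | stdpar                  |
| +/- | python                  |
| +/- | CCCL C Parallel Library |
| +/- | Catch2Helper            |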

🏃‍ Runner counts (total jobs: 103)

| #  | Runner                            |
|----|-----------------------------------|
| 72 | linux-amd64-cpu16                 |
| 9  | windows-amd64-cpu16               |
| 6  | linux-arm64-cpu16                 |
| 6  | linux-amd64-gpu-rtxa6000-latest-1 |
| 4  | linux-amd64-gpu-rtx2080-latest-1  |
| 3  | linux-amd64-gpu-h100-latest-1     |
| 3  | linux-amd64-gpu-rtx4090-latest-1  |

github-actions[bot] avatar Apr 16 '25 20:04 github-actions[bot]

@miscco please take another look!

brycelelbach avatar Nov 04 '25 00:11 brycelelbach

🥳 CI Workflow Results

🟩 Finished in 1h 07m: Pass: 100%/70 | Total: 9h 10m | Max: 1h 05m | Hits: 99%/115418

See results here.

github-actions[bot] avatar Nov 04 '25 01:11 github-actions[bot]

@brycelelbach you might want to consider adding a signing key so that the runners fire automatically when you push.

gonidelis avatar Nov 05 '25 17:11 gonidelis

/ok to test efcce67

brycelelbach avatar Nov 10 '25 17:11 brycelelbach

🥳 CI Workflow Results

🟩 Finished in 1h 10m: Pass: 100%/70 | Total: 11h 09m | Max: 1h 05m | Hits: 99%/115426

See results here.

github-actions[bot] avatar Nov 10 '25 20:11 github-actions[bot]

🥳 CI Workflow Results

🟩 Finished in 46m 53s: Pass: 100%/70 | Total: 6h 49m | Max: 28m 44s | Hits: 99%/115426

See results here.

github-actions[bot] avatar Nov 10 '25 23:11 github-actions[bot]