
[FEA]: Improve NVHPC smoke tests

Open miscco opened this issue 1 month ago • 1 comments

Is this a duplicate?

  • [x] I confirmed there appear to be no duplicate issues for this request and that I agree to the Code of Conduct

Area

libcu++

Is your feature request related to a problem? Please describe.

NVHPC relies on our implementation for their stdpar backend.

We have one smoke test in https://github.com/NVIDIA/cccl/tree/main/test that serves as the sole backstop before we break their internal CI.

We want to extend the coverage to more algorithms. Note that compiling the tests is sufficient for now.

Describe the solution you'd like

Implement more small test cases that verify that we do not break NVHPC:

  • [ ] adjacent_difference
  • [ ] exclusive_scan
  • [ ] inclusive_scan
  • [ ] transform_exclusive_scan
  • [ ] transform_inclusive_scan
  • [ ] transform_reduce
  • [ ] all_of
  • [ ] any_of
  • [ ] count
  • [ ] copy
  • [ ] copy_if
  • [ ] copy_n
  • [ ] count_if
  • [ ] equal
  • [ ] fill
  • [ ] fill_n
  • [ ] find
  • [ ] find_if
  • [ ] find_if_not
  • [ ] for_each
  • [ ] for_each_n
  • [ ] generate
  • [ ] generate_n
  • [ ] is_partitioned
  • [ ] is_sorted
  • [ ] is_sorted_until
  • [ ] max_element
  • [ ] min_element
  • [ ] minmax_element
  • [ ] merge
  • [ ] mismatch
  • [ ] none_of
  • [ ] partition
  • [ ] partition_copy
  • [ ] remove
  • [ ] remove_copy
  • [ ] remove_copy_if
  • [ ] remove_if
  • [ ] replace
  • [ ] replace_copy
  • [ ] replace_copy_if
  • [ ] replace_if
  • [ ] reverse
  • [ ] reverse_copy
  • [ ] rotate_copy
  • [ ] set_difference
  • [ ] set_intersection
  • [ ] set_symmetric_difference
  • [ ] set_union
  • [ ] sort
  • [ ] stable_partition
  • [ ] stable_sort
  • [ ] swap_ranges
  • [ ] transform
  • [ ] unique
  • [ ] unique_copy

Describe alternatives you've considered

No response

Additional context

The reduce test lives here

We should take this test as a sample and expand it to more accelerated algorithms.
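As a rough illustration of what one such per-algorithm smoke test could look like, here is a minimal sketch for `count` and `count_if`. It is not derived from the existing reduce test, whose contents are not shown in this issue. The helper names (`smoke_test_count`, `smoke_test_count_if`, `run_count_smoke_tests`) are hypothetical, and the execution policy is a template parameter so the same code can be built with whatever policy the target toolchain expects.

```cpp
#include <algorithm>
#include <cassert>
#include <execution>
#include <vector>

// Hypothetical helper (not part of CCCL): exercises the policy overload of std::count.
template <class Policy>
bool smoke_test_count(const Policy& policy)
{
  std::vector<int> data(1000, 0);
  data[10]  = 42;
  data[500] = 42;

  const auto found = std::count(policy, data.begin(), data.end(), 42);
  return found == 2;
}

// Hypothetical helper (not part of CCCL): exercises the policy overload of std::count_if.
template <class Policy>
bool smoke_test_count_if(const Policy& policy)
{
  std::vector<int> data(1000, 1);
  data[0] = -1;
  data[999] = -1;

  const auto negatives =
    std::count_if(policy, data.begin(), data.end(), [](int value) { return value < 0; });
  return negatives == 2;
}

// Hypothetical driver: a real test would plug into the project's own test harness.
template <class Policy>
void run_count_smoke_tests(const Policy& policy)
{
  assert(smoke_test_count(policy));
  assert(smoke_test_count_if(policy));
}
```

Since compiling is the main goal for now, instantiating the helpers with the desired policy, for example `run_count_smoke_tests(std::execution::par)`, is presumably enough to catch most breakage; the runtime checks are a small bonus.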

miscco, Nov 04 '25 19:11

These are the stdpar algorithms that are implemented directly through corresponding Thrust algorithms and can serve as an initial list of algorithms for the smoke tests:

adjacent_difference, exclusive_scan, inclusive_scan, transform_exclusive_scan, transform_inclusive_scan, transform_reduce, all_of, 
any_of, count, copy, copy_if, copy_n, count_if, equal, fill, fill_n, find, find_if, find_if_not, for_each, for_each_n, generate, generate_n, 
is_partitioned, is_sorted, is_sorted_until, max_element, min_element, minmax_element, merge, mismatch, none_of, partition, 
partition_copy, remove, remove_copy, remove_copy_if, remove_if, replace, replace_copy, replace_copy_if, replace_if, reverse, 
reverse_copy, rotate_copy, set_difference, set_intersection, set_symmetric_difference, set_union, sort, stable_partition, stable_sort, 
swap_ranges, transform, unique, unique_copy

Some of the other parallel stdpar algorithms are implemented in terms of the algorithms listed above, so I haven't listed them separately. I can also help with writing smoke tests for each of these algorithms.

zkhatami, Nov 10 '25 17:11

@miscco I would like to work on this, if possible.

viralbhadeshiya, Nov 15 '25 19:11

@viralbhadeshiya that would be great.

I would suggest picking one algorithm at a time and looking at the existing reduce test as well as the documentation of that algorithm. We want to test all overloads that take an execution policy.
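To make "all overloads that take an execution policy" concrete, here is a minimal sketch for `inclusive_scan`, which has three such overloads in the C++ standard library. The helper names (`smoke_test_inclusive_scan`, `run_inclusive_scan_smoke_test`) are hypothetical and only illustrate the idea; they are not taken from the existing reduce test.

```cpp
#include <cassert>
#include <execution>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical helper (not part of CCCL): calls every std::inclusive_scan
// overload that accepts an execution policy.
template <class Policy>
bool smoke_test_inclusive_scan(const Policy& policy)
{
  const std::vector<int> input{1, 2, 3, 4};
  std::vector<int> output(input.size());

  // Overload 1: (policy, first, last, d_first), sums with operator+
  std::inclusive_scan(policy, input.begin(), input.end(), output.begin());
  const bool ok_default = (output == std::vector<int>{1, 3, 6, 10});

  // Overload 2: (policy, first, last, d_first, binary_op)
  std::inclusive_scan(policy, input.begin(), input.end(), output.begin(), std::multiplies<>{});
  const bool ok_op = (output == std::vector<int>{1, 2, 6, 24});

  // Overload 3: (policy, first, last, d_first, binary_op, init)
  std::inclusive_scan(policy, input.begin(), input.end(), output.begin(), std::plus<>{}, 10);
  const bool ok_init = (output == std::vector<int>{11, 13, 16, 20});

  return ok_default && ok_op && ok_init;
}

// Hypothetical driver: a real test would plug into the project's own test harness.
template <class Policy>
void run_inclusive_scan_smoke_test(const Policy& policy)
{
  assert(smoke_test_inclusive_scan(policy));
}
```

The same pattern should carry over to the other algorithms in the list: look up the policy-taking overloads in the algorithm's documentation and call each one once with small, heap-allocated input.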

miscco, Nov 17 '25 10:11