[FEA]: Improve NVHPC smoke tests
Is this a duplicate?
- [x] I confirmed there appear to be no duplicate issues for this request and that I agree to the Code of Conduct
Area
libcu++
Is your feature request related to a problem? Please describe.
NVHPC relies on our implementation for their stdpar backend.
We have one smoke test in https://github.com/NVIDIA/cccl/tree/main/test that serves as the sole backstop before we break their internal CI.
We want to extend that coverage to more algorithms. Note that compiling the tests is sufficient for now.
Describe the solution you'd like
Implement more small test cases that verify that we do not break NVHPC, one per algorithm:
- [ ] adjacent_difference
- [ ] exclusive_scan
- [ ] inclusive_scan
- [ ] transform_exclusive_scan
- [ ] transform_inclusive_scan
- [ ] transform_reduce
- [ ] all_of
- [ ] any_of
- [ ] count
- [ ] copy
- [ ] copy_if
- [ ] copy_n
- [ ] count_if
- [ ] equal
- [ ] fill
- [ ] fill_n
- [ ] find
- [ ] find_if
- [ ] find_if_not
- [ ] for_each
- [ ] for_each_n
- [ ] generate
- [ ] generate_n
- [ ] is_partitioned
- [ ] is_sorted
- [ ] is_sorted_until
- [ ] max_element
- [ ] min_element
- [ ] minmax_element
- [ ] merge
- [ ] mismatch
- [ ] none_of
- [ ] partition
- [ ] partition_copy
- [ ] remove
- [ ] remove_copy
- [ ] remove_copy_if
- [ ] remove_if
- [ ] replace
- [ ] replace_copy
- [ ] replace_copy_if
- [ ] replace_if
- [ ] reverse
- [ ] reverse_copy
- [ ] rotate_copy
- [ ] set_difference
- [ ] set_intersection
- [ ] set_symmetric_difference
- [ ] set_union
- [ ] sort
- [ ] stable_partition
- [ ] stable_sort
- [ ] swap_ranges
- [ ] transform
- [ ] unique
- [ ] unique_copy
Describe alternatives you've considered
No response
Additional context
The reduce test lives here
We should use this test as a template and extend it to more accelerated algorithms.
These are the stdpar algorithms that are implemented directly through corresponding Thrust algorithms and can serve as an initial list of algorithms for the smoke tests:
adjacent_difference, exclusive_scan, inclusive_scan, transform_exclusive_scan, transform_inclusive_scan, transform_reduce, all_of,
any_of, count, copy, copy_if, copy_n, count_if, equal, fill, fill_n, find, find_if, find_if_not, for_each, for_each_n, generate, generate_n,
is_partitioned, is_sorted, is_sorted_until, max_element, min_element, minmax_element, merge, mismatch, none_of, partition,
partition_copy, remove, remove_copy, remove_copy_if, remove_if, replace, replace_copy, replace_copy_if, replace_if, reverse,
reverse_copy, rotate_copy, set_difference, set_intersection, set_symmetric_difference, set_union, sort, stable_partition, stable_sort,
swap_ranges, transform, unique, unique_copy
Some of the other parallel stdpar algorithms are implemented in terms of the algorithms listed above, so I haven't included them separately because this list already covers them. I can also help with writing smoke tests for each of these algorithms.
@miscco I would like to work on this if possible.
@viralbhadeshiya that would be great.
I would suggest picking one algorithm at a time and looking at the existing reduce test as well as the documentation of that algorithm. We want to test all overloads that take an execution policy.
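To make "all overloads that take an execution policy" concrete, here is a rough sketch for `inclusive_scan`, which has three such overloads in the standard library. This is not taken from the existing reduce test: the helper name `smoke_inclusive_scan` and the choice to template on the policy are assumptions for illustration only, and the actual test layout in the repository may differ.

```cpp
#include <algorithm>
#include <execution>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical helper (not part of CCCL): calls every std::inclusive_scan
// overload that accepts an execution policy and checks the results.
template <class Policy>
bool smoke_inclusive_scan(Policy&& policy)
{
  const std::vector<int> input{1, 2, 3, 4};
  std::vector<int> output(input.size());

  // Overload 1: policy, input range, output iterator
  std::inclusive_scan(policy, input.begin(), input.end(), output.begin());
  bool ok = (output == std::vector<int>{1, 3, 6, 10});

  // Overload 2: additionally takes a binary operation
  std::inclusive_scan(policy, input.begin(), input.end(), output.begin(), std::plus<>{});
  ok = ok && (output == std::vector<int>{1, 3, 6, 10});

  // Overload 3: additionally takes an initial value
  std::inclusive_scan(policy, input.begin(), input.end(), output.begin(), std::plus<>{}, 10);
  ok = ok && (output == std::vector<int>{11, 13, 16, 20});

  return ok;
}
```

A test's `main` could then do something like `return smoke_inclusive_scan(std::execution::par) ? 0 : 1;`. Since compiling is sufficient for now, the result checks are optional. When trying this with a host compiler instead of NVHPC, note that some standard library implementations need an extra backend library (for example TBB with libstdc++) to link the parallel policies.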