rocPRIM
Develop stream 20220711
- Improved the configuration generation script; it now supports all benchmarks that have config autotuning.
- Added support for large indices in device_reduce_by_key, device_partition, and device_unique.
- Improved block sort test by covering a wider range of input sizes.
- Small improvement to performance of device radix sort.
- Added workaround for benchmarks that showed poor performance due to a compiler problem.
- Added clarification for dysfunctional method in block sort.
- Performance improvement for device merge sort with custom types.
- Improved the large indices test for device adjacent difference by verifying output locations.
- Removed workaround for ROCM_SYMLINK_LIBS.
- Fixed bug in device merge sort that could result in incorrect results for structure types.