std::sort with device_policy throws an exception on large data
Greetings!
When using std::sort with oneapi::dpl::execution::device_policy, an exception is thrown on a large amount of data. Sorting was performed on the CPU.
When using the usual std::sort or tbb::parallel_sort, everything works without problems.
Code:
#include <CL/sycl.hpp>
#include <oneapi/dpl/algorithm>
#include <oneapi/dpl/execution>
#include <oneapi/dpl/iterator>
#include <oneapi/tbb/parallel_sort.h>
#include <algorithm>  // std::generate
#include <iostream>   // std::cin, std::cout
#include <random>
#include <vector>

// Fill a vector of the given size with uniformly distributed random integers.
template <typename T>
std::vector<T> make_random(size_t size, size_t random_range_left = 1,
                           size_t random_range_right = 10000) {
    std::random_device rd;
    std::mt19937 gen(rd());
    std::vector<T> out(size);
    std::uniform_int_distribution<T> dist(random_range_left, random_range_right);
    std::generate(out.begin(), out.end(), [&]() { return dist(gen); });
    return out;
}

// Sort through oneDPL with a device policy that targets the CPU device.
void sycl_sort(std::vector<int> &src) {
    size_t buf_size = src.size();
    auto sel = sycl::cpu_selector{};
    sycl::queue q{sel};
    auto dev_policy = oneapi::dpl::execution::device_policy{sel};
    // The buffer wraps the host data; results are written back on destruction.
    sycl::buffer<int> buff_src(src.data(), sycl::range<1>{buf_size});
    std::sort(dev_policy, oneapi::dpl::begin(buff_src),
              oneapi::dpl::end(buff_src));
}

void tbb_sort(std::vector<int> &src) { tbb::parallel_sort(begin(src), end(src)); }

int main(int argc, char* argv[]) {
    // The number of elements to sort is read from standard input.
    size_t buf_size;
    std::cin >> buf_size;
    std::vector<int> src = make_random<int>(buf_size);
    std::cout << "Data generated" << std::endl;
    tbb_sort(src);
    std::cout << "TBB sort performed" << std::endl;
    sycl_sort(src);
    std::cout << "SYCL sort performed" << std::endl;
}
$ ./sort_issue
1000000000
Data generated
TBB sort performed
terminate called after throwing an instance of 'cl::sycl::runtime_error'
  what():  Native API failed. Native API returns: -5 (CL_OUT_OF_RESOURCES) -5 (CL_OUT_OF_RESOURCES)
Aborted (core dumped)
Environment:
- OS: Linux
- Device: Intel(R) Xeon(R) Platinum 8280L CPU @ 2.70GHz
Hello, @bagrorg.
The issue has been fixed in #389. There is no official release containing the fix yet, but the most stable branch with the fix is release/2021.6. I suggest using the code from this branch.
Alternatively, you can apply a workaround that does not require updating oneDPL. Before running the program, set the CL_CONFIG_CPU_FORCE_PRIVATE_MEM_SIZE environment variable to an amount of memory sufficient for the OpenCL runtime, e.g. CL_CONFIG_CPU_FORCE_PRIVATE_MEM_SIZE=1MB ./sort. You can find more information about the variable and its limitations here.
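Since the workaround is easy to forget when launching the program, a reproducer could report at startup whether the variable is present. The following is only a sketch: the helper names private_mem_setting and workaround_enabled are hypothetical and not part of oneDPL or SYCL. It uses only std::getenv and the variable name quoted above, and it does not validate the value itself.

```cpp
#include <cstdlib>
#include <string>

// Name of the OpenCL CPU runtime variable from the workaround described above.
constexpr const char* kPrivateMemVar = "CL_CONFIG_CPU_FORCE_PRIVATE_MEM_SIZE";

// Hypothetical helper: returns the variable's value, or an empty string if it is unset.
std::string private_mem_setting() {
    const char* value = std::getenv(kPrivateMemVar);
    return value ? std::string(value) : std::string();
}

// Hypothetical helper: true if the workaround variable is set to a non-empty value.
bool workaround_enabled() {
    return !private_mem_setting().empty();
}
```

A program could call workaround_enabled() at the top of main and print a warning when it returns false, before any SYCL queue is created.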
Meanwhile, it looks like sycl_sort sorts data that has already been sorted by tbb_sort.
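One way to give each sort the same unsorted input is to run it on its own copy of the data. The sketch below assumes a hypothetical helper named sort_copy, and it uses plain std::sort as a stand-in for tbb_sort and sycl_sort so that it compiles without oneDPL or oneTBB.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical helper: runs `sorter` on a private copy of `input`, so every
// sorter starts from the same unsorted data, and returns the sorted copy.
std::vector<int> sort_copy(const std::vector<int>& input,
                           const std::function<void(std::vector<int>&)>& sorter) {
    std::vector<int> copy = input;  // the caller's data stays untouched
    sorter(copy);
    assert(std::is_sorted(copy.begin(), copy.end()));  // sanity check on the result
    return copy;
}

// Stand-in for tbb_sort / sycl_sort from the report (assumption: plain std::sort).
void reference_sort(std::vector<int>& v) { std::sort(v.begin(), v.end()); }
```

In the reproducer, main could then call sort_copy(src, tbb_sort) and sort_copy(src, sycl_sort) instead of sorting src in place twice. The extra copy costs memory: with 1000000000 elements and a 4-byte int, each copy is about 4 GB.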
The fix has been included in https://github.com/oneapi-src/oneDPL/releases/tag/oneDPL-2021.6.1-release.