std::sort with device_policy throws an exception on large data

Open bagrorg opened this issue 3 years ago • 2 comments

Greetings! While using std::sort with oneapi::dpl::execution::device_policy, an exception occurs on a large amount of data. Sorting was performed on the CPU.

With the regular std::sort or tbb::parallel_sort, everything works without problems.

Code:

#include <CL/sycl.hpp>
#include <oneapi/dpl/algorithm>
#include <oneapi/dpl/execution>
#include <oneapi/dpl/iterator>

#include <oneapi/tbb/parallel_sort.h>

#include <algorithm>
#include <iostream>
#include <random>
#include <vector>

template <typename T>
std::vector<T> make_random(size_t size, size_t random_range_left = 1,
                           size_t random_range_right = 10000) {
  std::random_device rd;
  std::mt19937 gen(rd());
  std::vector<T> out(size);
  std::uniform_int_distribution<T> dist(random_range_left, random_range_right);
  std::generate(out.begin(), out.end(), [&]() { return dist(gen); });
  return out;
}

void sycl_sort(std::vector<int> &src) {
  size_t buf_size = src.size();
  auto sel = sycl::cpu_selector{};
  sycl::queue q{sel};
  auto dev_policy = oneapi::dpl::execution::device_policy{sel};

  sycl::buffer<int> buff_src(src.data(), sycl::range<1>{buf_size});

  std::sort(dev_policy, oneapi::dpl::begin(buff_src),
            oneapi::dpl::end(buff_src));
}

void tbb_sort(std::vector<int> &src) { tbb::parallel_sort(begin(src), end(src)); }

int main(int argc, char* argv[]) {
  size_t buf_size;
  std::cin >> buf_size;
  std::vector<int> src = make_random<int>(buf_size);
  std::cout << "Data generated" << std::endl;

  tbb_sort(src);
  std::cout << "TBB sort performed" << std::endl;

  sycl_sort(src);
  std::cout << "SYCL sort performed" << std::endl;
}

Output:

$ ./sort_issue
1000000000
Data generated
TBB sort performed
terminate called after throwing an instance of 'cl::sycl::runtime_error'
  what():  Native API failed. Native API returns: -5 (CL_OUT_OF_RESOURCES) -5 (CL_OUT_OF_RESOURCES)
Aborted (core dumped)
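
The "terminate called" line means the exception left main uncaught. Since the terminate handler printed what(), the thrown type evidently derives from std::exception, so a generic catch can report it without aborting. Below is a minimal sketch of that idea, not part of the original report: run_reporting_errors is a hypothetical helper name, and it only covers exceptions thrown synchronously from the call it wraps.

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <iostream>
#include <stdexcept>
#include <string>

// Hypothetical helper (assumption, not from the report): runs `work` and
// reports any std::exception-derived error instead of letting it reach
// std::terminate. Returns true if `work` completed without throwing.
bool run_reporting_errors(const std::string &label,
                          const std::function<void()> &work) {
  try {
    work();
    std::cout << label << " performed" << std::endl;
    return true;
  } catch (const std::exception &e) {
    // what() carries the runtime's message, e.g. the native API error text.
    std::cerr << label << " failed: " << e.what() << std::endl;
    return false;
  }
}
```

In the repro, the call could look like run_reporting_errors("SYCL sort", [&] { sycl_sort(src); }); which would let the program exit cleanly and still show the error text.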

Environment:

  • OS: Linux
  • Device: Intel(R) Xeon(R) Platinum 8280L CPU @ 2.70GHz

bagrorg, Feb 02 '22 12:02

Hello, @bagrorg.

The issue has been fixed in #389. No official release contains the fix yet; the most stable branch that includes it is release/2021.6, so I suggest using the code from that branch.

Alternatively, you can apply a workaround that does not require updating oneDPL: before running the program, set the CL_CONFIG_CPU_FORCE_PRIVATE_MEM_SIZE environment variable to an amount of memory sufficient for the OpenCL runtime, e.g. CL_CONFIG_CPU_FORCE_PRIVATE_MEM_SIZE=1MB ./sort. You can find more information about the variable and its limitations here.

Separately, it looks like sycl_sort is sorting data that tbb_sort has already sorted.
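
One way to give each sort the same unsorted input is to run it on its own copy of the data. This is a minimal sketch, not code from the issue: sort_copy is a hypothetical helper name, and it accepts any callable that sorts a std::vector<int> in place, such as the tbb_sort and sycl_sort functions from the report.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper (assumption, not from the report): runs `sorter` on a
// private copy of `input`, so every sort under test starts from the same
// unsorted data. `input` itself is left untouched.
template <typename Sorter>
std::vector<int> sort_copy(const std::vector<int> &input, Sorter sorter) {
  std::vector<int> copy = input;  // fresh, unsorted copy for this sorter
  sorter(copy);                   // sorter is expected to sort in place
  // Sanity check that the sorter really produced ascending order.
  assert(std::is_sorted(copy.begin(), copy.end()));
  return copy;
}
```

In main this would read sort_copy(src, tbb_sort); followed by sort_copy(src, sycl_sort);. Note that each copy costs extra memory: for 1000000000 elements that is about 4 GB per copy, assuming a 4-byte int.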

dmitriy-sobolev, Feb 02 '22 14:02

The fix has been included in https://github.com/oneapi-src/oneDPL/releases/tag/oneDPL-2021.6.1-release.

dmitriy-sobolev, Feb 07 '22 14:02