reduce_by_key results are non-deterministic for floats

Open • lilohuang opened this issue 2 years ago • 1 comment

This is a follow-up question to https://github.com/NVIDIA/thrust/issues/1587

@allisonvacanti @senior-zero

As with thrust::reduce(), the results of thrust::reduce_by_key() are also non-deterministic for floating-point types, am I right?

In my limited testing with the CUDA 11.6 SDK, the code below occasionally produces different results from two identical calls in the same run. You need to run the program several times to see the mismatch. I also tested cub::DeviceReduce::ReduceByKey (with the 11.6 SDK), and it does not provide the "run-to-run" determinism that its documentation describes.

Note: this issue is mainly about Thrust; there is a similar issue for CUB (https://github.com/NVIDIA/cub/issues/441)

#include <thrust/host_vector.h>
#include <thrust/device_vector.h>

#include <thrust/copy.h>
#include <thrust/distance.h>
#include <thrust/fill.h>
#include <thrust/sequence.h>
#include <thrust/reduce.h>
#include <cassert>
#include <iostream>

int main() {
  auto const numElements = 250000;
  thrust::device_vector<double> data(numElements, 0.1);
  thrust::device_vector<double> keys(numElements, 1);

  thrust::device_vector<double> keys_out1(numElements);
  thrust::device_vector<double> keys_out2(numElements);

  thrust::device_vector<double> out1(numElements);
  thrust::device_vector<double> out2(numElements);

  thrust::host_vector<double> hostOut1(numElements);
  thrust::host_vector<double> hostOut2(numElements);

  size_t num1 = thrust::distance(keys_out1.begin(),
    thrust::reduce_by_key(keys.begin(), keys.end(), data.begin(), keys_out1.begin(), out1.begin()).first);
  size_t num2 = thrust::distance(keys_out2.begin(),
    thrust::reduce_by_key(keys.begin(), keys.end(), data.begin(), keys_out2.begin(), out2.begin()).first);
  assert(num1 == num2);
  std::cout << "num = " << num1 << std::endl;

  // Copy the reduced values (the first num1 / num2 elements) to the host
  thrust::copy(out1.begin(), out1.begin()+num1, hostOut1.begin());
  thrust::copy(out2.begin(), out2.begin()+num2, hostOut2.begin());

  // Check that the two runs produced bitwise-identical results
  for(size_t i = 0; i < num1; i++) {
    if (hostOut1[i] != hostOut2[i]) {
      std::cout << "Element "<< i << " is not equal" << std::endl;
    }
  }

  return 0;
}

lilohuang • Feb 25 '22 02:02

It looks like we'll need to update the docs here, similar to NVIDIA/thrust#1587. Thanks for pointing this out!

alliepiper • Mar 07 '22 19:03
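
For readers wondering how two identical calls can disagree at all: floating-point addition is not associative, so a reduction whose partial sums are grouped differently from one run to the next can round differently. The host-only sketch below illustrates just that effect. It is not how Thrust or CUB implement reduce_by_key, and the helper name sum_with_split is hypothetical, introduced only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: sum the first `split` elements and the remaining
// elements separately, then add the two partial sums. Changing `split`
// changes only how the additions are grouped, not the values being added.
double sum_with_split(const std::vector<double>& values, std::size_t split) {
  assert(split <= values.size());
  const auto middle = values.begin() + static_cast<std::ptrdiff_t>(split);
  const double left = std::accumulate(values.begin(), middle, 0.0);
  const double right = std::accumulate(middle, values.end(), 0.0);
  return left + right;
}
```

With IEEE 754 doubles and the input {0.1, 0.2, 0.3}, sum_with_split(v, 1) computes 0.1 + (0.2 + 0.3) and yields 0.6, while sum_with_split(v, 2) computes (0.1 + 0.2) + 0.3 and yields 0.6000000000000001. A parallel reduction that does not fix the grouping of its partial sums can show the same kind of difference, which would be consistent with the occasional mismatch reported above.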