[BUG]: minmax element test failing on Thrust's tbb.cuda configuration: values are not equal

Open • alliepiper opened this issue 1 year ago • 1 comment

Is this a duplicate?

  • [x] I confirmed there appear to be no duplicate issues for this bug and that I agree to the Code of Conduct

Type of Bug

Runtime Error

Component

Thrust

Describe the bug

    Start 5507: thrust.tbb.cuda.cpp17.test.minmax_element
1/1 Test #5507: thrust.tbb.cuda.cpp17.test.minmax_element ...***Failed    0.69 sec
Testing Device 0: "NVIDIA GeForce RTX 4090"
Running 10 unit tests.
...F...F..
================================================================
FAILURE: TestMinMaxElementSimpleDevice
[/home/coder/cccl/thrust/testing/minmax_element.cu:17] values are not equal: custom_numeric{3} 1 [type='thrust::THRUST_200600_600_700_800_NS::device_reference<custom_numeric>']
================================================================
FAILURE: TestMinMaxElementWithTransformDevice
[/home/coder/cccl/thrust/testing/minmax_element.cu:37] values are not equal: custom_numeric{-3} -5 [type='custom_numeric']
================================================================
Totals: 2 failures, 0 known failures, 0 errors, and 8 passes.
Time:  0.0166667 minutes
CMake Error at /home/coder/cccl/thrust/cmake/ThrustRunTest.cmake:7 (message):
  
  /home/coder/cccl/build/cuda12.5-gcc13/all-dev/bin/thrust.tbb.cuda.cpp17.test.minmax_element
  failed (1)

How to Reproduce

Git bisect shows that the bug was introduced in commit 4634d8111f2973dbb48f35458ab38a7bdb13b6fb, but it is easiest to reproduce on current main, because some other regressions have recently been fixed there.

cd cccl
cmake --preset all-dev
cmake --build --preset all-dev --target thrust.tbb.cuda.cpp17.test.minmax_element
ctest --preset all-dev --output-on-failure -R thrust.tbb.cuda.cpp17.test.minmax_element

Expected behavior

The test should pass.
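
As a rough, host-only illustration of what "pass" means here, the sketch below runs the standard library's `std::minmax_element` over a small sequence and compares the results against expected values. It is not the code in `thrust/testing/minmax_element.cu`: the type `custom_numeric_sketch`, the helper `minmax_matches`, and the input values are all hypothetical stand-ins, and it does not use Thrust at all.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for the test's custom_numeric type: a thin wrapper
// around int with only the comparisons this sketch needs.
struct custom_numeric_sketch
{
  int value;

  friend bool operator<(const custom_numeric_sketch& a, const custom_numeric_sketch& b)
  {
    return a.value < b.value;
  }

  friend bool operator==(const custom_numeric_sketch& a, int b)
  {
    return a.value == b;
  }
};

// Hypothetical helper: returns true when the smallest and largest elements
// of `data` equal the expected values. An empty range never matches.
bool minmax_matches(const std::vector<custom_numeric_sketch>& data, int expected_min, int expected_max)
{
  if (data.empty())
  {
    return false;
  }
  const auto result = std::minmax_element(data.begin(), data.end());
  return *result.first == expected_min && *result.second == expected_max;
}
```

For an assumed input of {3, 1, 4, 1, 5, 9}, `minmax_matches(data, 1, 9)` holds. A result of 3 where 1 is expected would have the same shape as the first mismatch in the log above, although the actual test input is not shown in this report.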

Reproduction link

No response

Operating System

No response

nvidia-smi output

No response

NVCC version

No response

alliepiper, Sep 03 '24 16:09

I investigated those tests a bit.

The device tests do appear to fail, but only when the host tests are run as well, which suggests that something else is going on.

miscco, Sep 03 '24 18:09