Issues with `min()` and `argmin()` with `cuda` backend
Version of Awkward Array
2.8.3
Description and code to reproduce
When trying to use ak.min() and ak.argmin() with the cuda backend, I am seeing incorrect behavior for ak.min() and a CuPyKernel not found error for ak.argmin().
This code reproduces the issues:
```python
import awkward as ak
myarr = [[1.1,2.1],[1.2,2.2,-3.2]]
myarr_cpu = ak.Array(myarr,backend="cpu")
myarr_gpu = ak.Array(myarr,backend="cuda")
print("myarr:",myarr)
print("min (cpu):", ak.min(myarr_cpu,axis=1))
print("min (gpu):", ak.min(myarr_gpu,axis=1))
print("min (cpu):", ak.argmin(myarr_cpu,axis=1))
print("min (gpu):", ak.argmin(myarr_gpu,axis=1))
```
If things were working properly, I would expect to see this printed:
```
myarr: [[1.1, 2.1], [1.2, 2.2, -3.2]]
min (cpu): [1.1, -3.2]
min (gpu): [1.1, -3.2]
min (cpu): [0, 2]
min (gpu): [0, 2]
```
However, for ak.min() with the cuda backend I am getting [1.1, inf] (a note in case it is useful: from some preliminary investigation, the issue looks like it may be specific to cases that include a negative value).
And for ak.argmin() with the cuda backend I am getting AssertionError: CuPyKernel not found: ('awkward_reduce_argmin', <class 'numpy.int64'>, <class 'numpy.float64'>, <class 'numpy.int64'>).
The full output is shown here:
```
myarr: [[1.1, 2.1], [1.2, 2.2, -3.2]]
min (cpu): [1.1, -3.2]
min (gpu): [1.1, inf]
min (cpu): [0, 2]
Traceback (most recent call last):
  File "/home/k.mohrman/coffea_dir/gpu_studies/columnar_gpu/mre_argmin.py", line 12, in <module>
    print("min (gpu):", ak.argmin(myarr_gpu,axis=1))
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/blue/p.chang/k.mohrman/dir_for_miniconda/miniconda3/envs/coffeagpu_env5/lib/python3.12/site-packages/awkward/_dispatch.py", line 41, in dispatch
    with OperationErrorContext(name, args, kwargs):
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/blue/p.chang/k.mohrman/dir_for_miniconda/miniconda3/envs/coffeagpu_env5/lib/python3.12/site-packages/awkward/_errors.py", line 80, in __exit__
    raise self.decorate_exception(exception_type, exception_value)
  File "/blue/p.chang/k.mohrman/dir_for_miniconda/miniconda3/envs/coffeagpu_env5/lib/python3.12/site-packages/awkward/_dispatch.py", line 67, in dispatch
    next(gen_or_result)
  File "/blue/p.chang/k.mohrman/dir_for_miniconda/miniconda3/envs/coffeagpu_env5/lib/python3.12/site-packages/awkward/operations/ak_argmin.py", line 79, in argmin
    return _impl(array, axis, keepdims, mask_identity, highlevel, behavior, attrs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/blue/p.chang/k.mohrman/dir_for_miniconda/miniconda3/envs/coffeagpu_env5/lib/python3.12/site-packages/awkward/operations/ak_argmin.py", line 160, in _impl
    out = ak._do.reduce(
          ^^^^^^^^^^^^^^
  File "/blue/p.chang/k.mohrman/dir_for_miniconda/miniconda3/envs/coffeagpu_env5/lib/python3.12/site-packages/awkward/_do.py", line 294, in reduce
    next = layout._reduce_next(
           ^^^^^^^^^^^^^^^^^^^^
  File "/blue/p.chang/k.mohrman/dir_for_miniconda/miniconda3/envs/coffeagpu_env5/lib/python3.12/site-packages/awkward/contents/listoffsetarray.py", line 1618, in _reduce_next
    outcontent = trimmed._reduce_next(
                 ^^^^^^^^^^^^^^^^^^^^^
  File "/blue/p.chang/k.mohrman/dir_for_miniconda/miniconda3/envs/coffeagpu_env5/lib/python3.12/site-packages/awkward/contents/numpyarray.py", line 1137, in _reduce_next
    out = reducer.apply(self, parents, starts, shifts, outlength)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/blue/p.chang/k.mohrman/dir_for_miniconda/miniconda3/envs/coffeagpu_env5/lib/python3.12/site-packages/awkward/_reducers.py", line 172, in apply
    array.backend[
  File "/blue/p.chang/k.mohrman/dir_for_miniconda/miniconda3/envs/coffeagpu_env5/lib/python3.12/site-packages/awkward/_backends/cupy.py", line 39, in __getitem__
    raise AssertionError(f"CuPyKernel not found: {index!r}")
AssertionError: CuPyKernel not found: ('awkward_reduce_argmin', <class 'numpy.int64'>, <class 'numpy.float64'>, <class 'numpy.int64'>)
```
See if this has been reported at https://github.com/scikit-hep/awkward/issues
Hi @kmohrman, indeed, the argmin and argmax (and a couple of other) reducers have not been implemented for the cuda backend yet. The wrong value you're getting for ak.min, however, is a bug; I would assume it's something in the kernel implementation.
I see you're reporting a lot of cuda backend bugs lately, which is a very good thing. Thanks a lot! I would like to point out that it's good to test these things, but of course do not trust the cuda backend for an actual analysis yet; it's still under development.
https://github.com/scikit-hep/awkward/blob/939b46c26ef73fb5900eb9d94537e68ff666e97b/src/awkward/_connect/cuda/cuda_kernels/awkward_reduce_min.cu#L72
Is this the vanilla atomicMin from CUDA? It looks like that only handles integers: https://stackoverflow.com/questions/17399119/how-do-i-use-atomicmax-on-floating-point-values-in-cuda
Edit: that's indeed the issue.
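For context, here is a minimal standalone sketch (not taken from the Awkward sources) of why bit-level integer comparisons go wrong for negative floats: in IEEE 754 the sign bit is the most significant bit, so the bit pattern of any negative float compares larger, as an unsigned integer, than that of any positive float. That also matches the observation above that the wrong ak.min result only shows up when a negative value is present.

```cuda
// Sketch only (not from the Awkward sources): prints the raw IEEE-754 bit
// patterns of two values from the repro. Treated as unsigned integers,
// bits(-3.2f) compares *larger* than bits(1.1f), so any min that reinterprets
// float bits as integers needs special handling for negative values.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void compare_bits() {
  printf("bits(1.1f)  = 0x%08x\n", __float_as_uint(1.1f));   // 0x3f8ccccd
  printf("bits(-3.2f) = 0x%08x\n", __float_as_uint(-3.2f));  // 0xc04ccccd
}

int main() {
  compare_bits<<<1, 1>>>();
  cudaDeviceSynchronize();
  return 0;
}
```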
```diff
@@ -69,7 +75,7 @@ awkward_reduce_min_b(
     int64_t parent = parents[thread_id];
     if (idx == blockDim.x - 1 || thread_id == lenparents - 1 || parents[thread_id] != parents[thread_id + 1]) {
-      atomicMin(&toptr[parent], temp[thread_id]);
+      toptr[parent] = temp[thread_id];
     }
   }
```
This fixes the repro in this issue. It is okay here because grid_size is already set to one for this input; of course, in general we can't do this.
Thanks, @Moelf! Could you please make a PR? Thanks!
The problem is that this code path can potentially launch more than one thread, so we need a general solution anyway, because we use this in more than one place.
But I also see: https://github.com/scikit-hep/awkward/blob/939b46c26ef73fb5900eb9d94537e68ff666e97b/src/awkward/_connect/cuda/cuda_kernels/cuda_common.cu#L173-L179
I guess there's a bug here?
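A common general fix for the float case (just a sketch of the sign-dispatch trick from the Stack Overflow answer linked earlier, not the actual Awkward implementation; the name atomicMinFloat is hypothetical) would look something like this:

```cuda
#include <cuda_runtime.h>

// Sketch of a sign-aware float atomicMin (ignores NaN and -0.0 edge cases).
// Non-negative IEEE-754 floats order the same way as their bit patterns read
// as signed ints, so the integer atomicMin works for them; negative floats
// order in *reverse* when their bit patterns are read as unsigned ints, so
// that branch uses atomicMax on the unsigned pattern instead.
__device__ float atomicMinFloat(float* addr, float value) {
  return (value >= 0.0f)
      ? __int_as_float(atomicMin(reinterpret_cast<int*>(addr), __float_as_int(value)))
      : __uint_as_float(atomicMax(reinterpret_cast<unsigned int*>(addr), __float_as_uint(value)));
}
```

Inside the reduction kernel such a helper would then replace the plain atomicMin call, e.g. atomicMinFloat(&toptr[parent], temp[thread_id]), while the existing integer overloads keep handling the integer dtypes.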