cupy.sign(NaN) == 0, unlike numpy
Description
numpy.sign(numpy.nan) returns NaN, but cupy.sign(cupy.nan) returns 0.
I expected either the behavior to be the same, or the difference to be documented. (Edit: I see now that the different behavior is documented, but not explicitly contrasted with numpy in the documentation. Apologies if this is better suited as a documentation enhancement, although I would prefer the matching behavior.)
To Reproduce
import numpy as np
import cupy as cp
# Second assertion fails
assert np.isnan(np.sign(np.nan))
assert cp.isnan(cp.sign(cp.nan))
Installation
Wheel (pip install cupy-***)
Environment
OS : Windows-10-10.0.19045-SP0
Python Version : 3.11.6
CuPy Version : 13.0.0
CuPy Platform : NVIDIA CUDA
NumPy Version : 1.24.0
SciPy Version : 1.11.2
Cython Build Version : 0.29.36
Cython Runtime Version : 3.0.2
CUDA Root : C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.2
nvcc PATH : C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.2\bin\nvcc.EXE
CUDA Build Version : 12020
CUDA Driver Version : 12020
CUDA Runtime Version : 12020 (linked to CuPy) / 12020 (locally installed)
cuBLAS Version : (available)
cuFFT Version : 11008
cuRAND Version : 10303
cuSOLVER Version : (11, 5, 0)
cuSPARSE Version : (available)
NVRTC Version : (12, 2)
Thrust Version : 200200
CUB Build Version : 200200
Jitify Build Version : b0269c8
cuDNN Build Version : (not loaded; try `import cupy.cuda.cudnn` first)
cuDNN Version : (not loaded; try `import cupy.cuda.cudnn` first)
NCCL Build Version : None
NCCL Runtime Version : None
cuTENSOR Version : None
cuSPARSELt Build Version : None
Device 0 Name : NVIDIA GeForce RTX 4090
Device 0 Compute Capability : 89
Device 0 PCI Bus ID : 0000:C1:00.0
Additional Information
No response
It looks like the kernel is defined as
((x > 0) - (x < 0))
numpy uses the expression in1 > 0 ? 1 : (in1 < 0 ? -1 : (in1 == 0 ? 0 : in1)) code here
Some brief testing in godbolt shows that with nvcc 11.7, the numpy expression compiles into the same number or fewer of PTX instructions, without any branches. (Even fewer with floating point if the (in1 == 0 ? 0 : in1) is replaced with the equivalent in1.)
@prutschman-iv Thanks for reporting this one and also checking the performance impact! Indeed it's better to fix this one. Would you be interested in submitting a pull request to address this bug?
I'll attempt to prepare a pull request with appropriate tests. For the tests, it looks like one approach would be adding something like this to test_misc.py. Does that seem right?
def test_sign_inf_nan(self):
self.check_unary_inf_nan("sign")
(Also, I just realized that (in1 == 0 ? 0 : in1) isn't equivalent to in1 in the case of -0. So, the longer form of the expression is necessary to match numpy.)
I must be misunderstanding some aspect of the test suite. I tried adding the test_sign_inf_nan and running the test suite with python setup.py test without fixing the bug, but I didn't get a test failure. I defined the test as assert False and didn't get a failure. So, I think my new test isn't getting run.
(I'm currently using the v12 branch because I couldn't get the main branch to build due to a problem with dlpack)
Thanks @prutschman-iv! The test looks legitimate to me. To run the test please use python -m pytest tests/cupy_tests/math_tests/test_misc.py. As for the dlpack issue, since the git submodule path has been changed between these branches, I'd suggest starting over from the fresh clone.
I was successful in running the test suite for v12, thank you. From there I also observed a divergence between cupy and numpy for complex numbers, so I've fixed that as well. From the discussion linking to this one it sounds like there is maybe some controversy about the "correct" behavior? But my fix causes the behavior to match numpy's behavior as of 1.26.
The efficiency story isn't as rosy for the complex numbers case; it appears that the existing code was complicated enough to cause branches rather than predicated execution already and the extra test for nan does not improve the situation.
Unfortunately, I'm still having trouble building on main, even on a completely fresh clone into an new directory. For some reason the file cupy/_core/include\cupy/_dlpack/dlpack.h contains only the literal string ../../../../../third_party/dlpack/include/dlpack/dlpack.h I am hesitant to open the PR until I've verified correct operation on the main branch.
@prutschman-iv feel free to file a draft PR so that one of us can look at it and help you verify the issues :)
Unfortunately, I'm still having trouble building on main, even on a completely fresh clone into an new directory. For some reason the file cupy/_core/include\cupy/_dlpack/dlpack.h contains only the literal string ../../../../../third_party/dlpack/include/dlpack/dlpack.h I am hesitant to open the PR until I've verified correct operation on the main branch.
On Windows, symlink support needs to be enabled manually: https://github.com/cupy/cupy/issues/7989#issuecomment-1826470128
I'm not trying to pester, but I wanted to point out that I've removed the draft status from my PR, just in case that notification isn't automatic.