libcudacxx
libcudacxx copied to clipboard
Removed block of complex multiply checks
Addresses issue #306 pending verification that none of these checks break standards conformance. As noted in the linked issue, a large performance penalty can happen in certain applications that heavily use complex multiplies by introducing extra branches and instructions. This PR only performs the multiple without checking for NaN.
Thanks Cliff! I think this is a reasonable change, but I'll need to defer to the language lawyers to say for sure.
@ericniebler @brycelelbach could you confirm if this is a conforming change?
I've never been much of a floating point expert. Without doing some research, I can't tell you whether this is conforming or not.
Any thoughts on wrapping the removed lines with this:
NV_IF_TARGET(
NV_IS_HOST,
.... checks ....
)
This way we can keep the slow path intact on CPU? There could be libcxx tests that expect some result from these checks.
I am content to merge as-is though, I'll run tests and see if anything barfs.
Any thoughts on wrapping the removed lines with this:
NV_IF_TARGET( NV_IS_HOST, .... checks .... )
This way we can keep the slow path intact on CPU? There could be libcxx tests that expect some result from these checks.
I am content to merge as-is though, I'll run tests and see if anything barfs.
Is there precedent for doing different things on the host and device? Would it be better to have a version in cuda::complex
that removes all the checks (not just for multiply).
Standard seems to indicate it is conforming and the checks are unnecessary:
"If the result of a function is not mathematically defined or not in the range of representable values for its type, the behavior is undefined" https://eel.is/c++draft/complex.numbers#general-3.sentence-1
The checks should be an opt-in as the impact of doing those checks impact a lot more people negatively than not doing them does. This would also match libstdc++ behavior.
There are other checks in other operators that we may also want to remove.
Hey @cliffburdick, after internal discussion, I don't think we are ready to fully commit to removing these checks by default.
I think as a short-term solution, we could guard these checks behind a macro that allows you to opt-out of them. Maybe something like __LIBCUDACXX_NO_IEC_559_COMPLEX_
.
Hey @cliffburdick, after internal discussion, I don't think we are ready to fully commit to removing these checks by default.
I think as a short-term solution, we could guard these checks behind a macro that allows you to opt-out of them. Maybe something like
__LIBCUDACXX_NO_IEC_559_COMPLEX_
.
Thanks. Do we need to wait for consensus on the macro name and/or the way we set it?