libcudacxx icon indicating copy to clipboard operation
libcudacxx copied to clipboard

Removed block of complex multiply checks

Open cliffburdick opened this issue 1 year ago • 7 comments

Addresses issue #306 pending verification that none of these checks break standards conformance. As noted in the linked issue, a large performance penalty can happen in certain applications that heavily use complex multiplies by introducing extra branches and instructions. This PR only performs the multiple without checking for NaN.

cliffburdick avatar Aug 26 '22 14:08 cliffburdick

Thanks Cliff! I think this is a reasonable change, but I'll need to defer to the language lawyers to say for sure.

@ericniebler @brycelelbach could you confirm if this is a conforming change?

jrhemstad avatar Aug 26 '22 19:08 jrhemstad

I've never been much of a floating point expert. Without doing some research, I can't tell you whether this is conforming or not.

ericniebler avatar Aug 27 '22 01:08 ericniebler

Any thoughts on wrapping the removed lines with this:

NV_IF_TARGET(
   NV_IS_HOST,
.... checks ....
)

This way we can keep the slow path intact on CPU? There could be libcxx tests that expect some result from these checks.

I am content to merge as-is though, I'll run tests and see if anything barfs.

wmaxey avatar Aug 27 '22 19:08 wmaxey

Any thoughts on wrapping the removed lines with this:

NV_IF_TARGET(
   NV_IS_HOST,
.... checks ....
)

This way we can keep the slow path intact on CPU? There could be libcxx tests that expect some result from these checks.

I am content to merge as-is though, I'll run tests and see if anything barfs.

Is there precedent for doing different things on the host and device? Would it be better to have a version in cuda::complex that removes all the checks (not just for multiply).

cliffburdick avatar Aug 27 '22 20:08 cliffburdick

Standard seems to indicate it is conforming and the checks are unnecessary:

"If the result of a function is not mathematically defined or not in the range of representable values for its type, the behavior is undefined" https://eel.is/c++draft/complex.numbers#general-3.sentence-1

The checks should be an opt-in as the impact of doing those checks impact a lot more people negatively than not doing them does. This would also match libstdc++ behavior.

There are other checks in other operators that we may also want to remove.

luitjens avatar Aug 30 '22 21:08 luitjens

Hey @cliffburdick, after internal discussion, I don't think we are ready to fully commit to removing these checks by default.

I think as a short-term solution, we could guard these checks behind a macro that allows you to opt-out of them. Maybe something like __LIBCUDACXX_NO_IEC_559_COMPLEX_.

jrhemstad avatar Aug 30 '22 21:08 jrhemstad

Hey @cliffburdick, after internal discussion, I don't think we are ready to fully commit to removing these checks by default.

I think as a short-term solution, we could guard these checks behind a macro that allows you to opt-out of them. Maybe something like __LIBCUDACXX_NO_IEC_559_COMPLEX_.

Thanks. Do we need to wait for consensus on the macro name and/or the way we set it?

cliffburdick avatar Sep 02 '22 21:09 cliffburdick