thrust `__noinline__` Macro Definition causes `clang++` Compile Error

I'm working with clang++ 13.0 and CUDA Toolkit 11.6. It seems to me that there's a problem with the __noinline__ macro. In thrust, it is used as __attribute__((__noinline__)), which expects __noinline__ to expand to noinline. However, with clang++, __noinline__ expands to __attribute__((noinline)), which produces __attribute__((__attribute__((noinline)))) and causes a compile error. This happens when main.cu includes <thrust/system/cuda/pointer.h> and I compile with the following arguments:

clang++ --cuda-gpu-arch=sm_86 -std=c++17 -o main main.cu
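The nested expansion described above can be made visible without CUDA by stringifying the attribute before and after the macro is defined. This is only a sketch: the `#define` is a stand-in for the CUDA definition, and `STRINGIFY`, `without_macro`, `with_macro`, and `count_occurrences` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Two-level stringification: the argument is macro-expanded, then quoted.
#define STRINGIFY_(x) #x
#define STRINGIFY(x) STRINGIFY_(x)

// No macro yet: the attribute name is left exactly as written.
static const char* const without_macro =
    STRINGIFY(__attribute__((__noinline__)));

// Stand-in (assumption) for the CUDA definition of __noinline__.
#define __noinline__ __attribute__((noinline))

// With the macro defined, the attribute name itself gets replaced.
static const char* const with_macro =
    STRINGIFY(__attribute__((__noinline__)));

// Keep the stand-in macro from leaking past this sketch.
#undef __noinline__

// Counts non-overlapping occurrences of `needle` in `text`.
int count_occurrences(const char* text, const char* needle) {
    assert(text != nullptr && needle != nullptr && *needle != '\0');
    const std::size_t step = std::strlen(needle);
    int count = 0;
    for (const char* p = std::strstr(text, needle); p != nullptr;
         p = std::strstr(p + step, needle)) {
        ++count;
    }
    return count;
}
```

Counting `__attribute__` in each string should give one occurrence before the macro is defined and two afterwards, which is the nesting that clang++ rejects.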

May 23 '22 20:05 explocion

Here's part of the compile output:

In file included from /opt/cuda/include/thrust/system/cuda/pointer.h:27:
In file included from /opt/cuda/include/thrust/detail/reference.h:25:
In file included from /opt/cuda/include/thrust/system/detail/adl/assign_value.h:42:
In file included from /opt/cuda/include/thrust/system/cuda/detail/assign_value.h:25:
In file included from /opt/cuda/include/thrust/system/cuda/detail/copy.h:100:
In file included from /opt/cuda/include/thrust/system/cuda/detail/internal/copy_cross_system.h:42:
In file included from /opt/cuda/include/thrust/detail/temporary_array.h:40:
In file included from /opt/cuda/include/thrust/detail/contiguous_storage.h:21:
In file included from /opt/cuda/include/thrust/detail/allocator/allocator_traits.h:29:
In file included from /opt/cuda/include/thrust/detail/memory_wrapper.h:29:
In file included from /usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/12.1.0/../../../../include/c++/12.1.0/memory:77:
In file included from /usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/12.1.0/../../../../include/c++/12.1.0/bits/shared_ptr.h:53:
/usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/12.1.0/../../../../include/c++/12.1.0/bits/shared_ptr_base.h:196:22: error: use of undeclared identifier 'noinline'; did you mean 'inline'?
      __attribute__((__noinline__))
                     ^
/opt/cuda/include/crt/host_defines.h:83:24: note: expanded from macro '__noinline__'
        __attribute__((noinline))

May 23 '22 20:05 explocion

I had the same issue. After changing the macro in /opt/cuda/include/crt/host_defines.h:83

from #define __noinline__ __attribute__((noinline))

to #define __noinline__ noinline

the code compiles.

Jun 07 '22 11:06 hiaselhans

After doing some research, I found that compiling with -stdlib=libc++ works just fine. It seems that clang by default links STL code with libstdc++, which is a part of g++, where the macro __noinline__ is different from thrust's definition.

Jun 07 '22 14:06 explocion

__noinline__ is not a macro in libstdc++, it's an attribute name i.e. effectively a keyword. It's used as __attribute__((__noinline__)) which has been valid in GCC for many many years. GCC has always allowed __name__ for an attribute token, as the safe way for the implementation to refer to an attribute without clashing with a macro name in the program.
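A small sketch of why the reserved spelling is the safe one, assuming GCC or Clang: a program may legally define a macro called `noinline`, which rewrites the plain spelling but leaves `__noinline__` alone. The macro value `my_project_flag` and the names `STRINGIFY`, `plain_spelling`, `reserved_spelling`, `mentions`, `f_plain`, and `f_reserved` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstring>

// Two-level stringification: the argument is macro-expanded, then quoted.
#define STRINGIFY_(x) #x
#define STRINGIFY(x) STRINGIFY_(x)

// A user program is allowed to define a macro with this unreserved name.
#define noinline my_project_flag

// The plain spelling is rewritten by the user macro...
static const char* const plain_spelling =
    STRINGIFY(__attribute__((noinline)));
// ...while the double-underscore spelling is not affected by it.
static const char* const reserved_spelling =
    STRINGIFY(__attribute__((__noinline__)));

#undef noinline

// Returns true if `word` occurs anywhere in `text`.
bool mentions(const char* text, const char* word) {
    assert(text != nullptr && word != nullptr);
    return std::strstr(text, word) != nullptr;
}

// With no macro in the way, both spellings are accepted by GCC and Clang
// as names for the same attribute.
__attribute__((noinline)) int f_plain() { return 1; }
__attribute__((__noinline__)) int f_reserved() { return 2; }
```

The CUDA header breaks this scheme from the other direction: it defines the reserved name itself as a macro, which is the one thing the implementation assumes nobody does.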

Jul 25 '22 16:07 jwakely

I am not sure whether this is really a thrust issue. We are not defining __noinline__ in thrust and the code in question boils down to standard includes.

Jul 27 '22 09:07 miscco

The issue stems from the fact that CUDA headers define __noinline__ (which should be a reserved compiler keyword) as a macro for the noinline attribute. This conflicts with recent changes in the GCC 12 standard headers, where __noinline__ is used as the attribute name (the standard headers cannot use __attribute__((noinline)), because noinline is not a reserved keyword, so they use __attribute__((__noinline__)) instead).

A possible workaround is to undefine the __noinline__ macro before including the system headers, and redefine it afterwards.
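One way this workaround might look, as a sketch rather than a tested fix: `#pragma push_macro` and `#pragma pop_macro` (supported by GCC and Clang) save and restore the macro around a standard include. The `#define` below is a stand-in for the CUDA definition so the snippet builds without the toolkit, and `answer` and `shared_answer` are hypothetical names. Whether wrapping includes like this is sufficient in a real clang++ CUDA build has not been verified here.

```cpp
#include <cassert>

// Stand-in (assumption) for the CUDA definition; in a real build the CUDA
// headers would already have defined this before user code is reached.
#define __noinline__ __attribute__((noinline))

// Save the macro and hide it while the standard header is parsed, so that
// __attribute__((__noinline__)) inside the header is left untouched.
#pragma push_macro("__noinline__")
#undef __noinline__
#include <memory>
#pragma pop_macro("__noinline__")

// The macro is restored here, so code that relies on it still compiles.
__noinline__ int answer() { return 42; }

// Uses std::shared_ptr, the header path that failed in the report above.
int shared_answer() {
    std::shared_ptr<int> p = std::make_shared<int>(answer());
    assert(p != nullptr);
    return *p;
}

// Keep the stand-in macro from leaking past this sketch.
#undef __noinline__
```

The wrapping has to cover every include that reaches the affected standard headers, including indirect ones, which is why it is a workaround and not a fix.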

Jul 27 '22 17:07 Oblomov

Is there any progress regarding this issue? I'd like to use clang++ with libstdc++, but it seems impossible without modification of headers.

Aug 22 '22 21:08 JRazek

It looks like this was caused by a recent change in libstdc++ here: https://github.com/gcc-mirror/gcc/commit/dbf8bd3c2f2cd2d27ca4f0fe379bd9490273c6d7#diff-b358f609a31a4af8af72cc3197566abaa157bb7f8681b45580f1e5477540457cR192-R193

However, this issue is unique to clang as nvcc compiles the equivalent just fine:

__attribute__ (( __noinline__ )) void foo();

https://godbolt.org/z/Y6h99GcWb

This issue is unique to clang and Thrust does not officially support clang as a CUDA device compiler.

We are happy to review and accept any PRs from the community that fix this problem without breaking any of our supported compiler platforms.

However, I don't believe there is anything we can do in Thrust to address this issue because as https://godbolt.org/z/Y6h99GcWb shows, this problem exists in clang without including any Thrust headers.

Aug 23 '22 14:08 jrhemstad

The issue is not related to thrust project, closing it.

Feb 23 '23 16:02 gevtushenko

It looks like this was caused by a recent change in libstdc++ here: https://github.com/gcc-mirror/gcc/commit/dbf8bd3c2f2cd2d27ca4f0fe379bd9490273c6d7

However, this issue is unique to clang as nvcc compiles the equivalent just fine:

__attribute__ (( __noinline__ )) void foo();

It compiles, but it's not fine:

$ echo '__attribute__((__noinline__)) void foo();' | nvcc -x cu -dD -E - | tail -1
__attribute__((__attribute__((noinline)))) void foo();

AFAICT, nvcc just ignores an unknown attribute with the name __attribute__((noinline)) expanded from __noinline__.

Apr 27 '23 21:04 Artem-B

Indeed, I believe the nvcc frontend has special handling for that attribute expansion. clang would need to emulate that "special" handling :slightly_smiling_face:

Apr 27 '23 21:04 jrhemstad

Right. The __attribute__((__attribute__((noinline)))) void foo(); gets magically transformed into __attribute((noinline)) void foo() by the time it makes it to the final host compilation. 😭

And the magic seems to work only for __attribute__((__attribute__((noinline)))). Any other variants I tried error out.

So, it's been a known issue in the CUDA headers, deliberately worked around in NVCC. And now the bug lives on and keeps giving...

Apr 27 '23 22:04 Artem-B

So, it's been a known issue in the CUDA headers, deliberately worked around in NVCC.

NVIDIA engineers “fixed” the compiler instead of fixing the header files? That's brilliant!

Sep 29 '23 02:09 intractabilis
