
SSE2 Intrinsics cause compiler error

Open dragontamer opened this issue 6 years ago • 3 comments

I've got some simple SSE2 code here.

 #include <immintrin.h>
 
 int main(){
 // Uncomment the "#ifdef" to fix the issue
 //#ifdef __HCC_CPU__
     __m128i arb = _mm_undefined_si128();
     __m128i zero = _mm_xor_si128(arb, arb);
 
     return _mm_extract_epi64(zero, 0);
 //#endif
 }

I compile as:

hcc `hcc-config --cxxflags --ldflags` -msse2 example.cpp -o example

And it outputs the error:

example.cpp:4:19: error: always_inline function '_mm_undefined_si128' requires
      target feature 'sse2', but would be inlined into function 'main' that is
      compiled without support for 'sse2'
    __m128i arb = _mm_undefined_si128();
                  ^
example.cpp:5:20: error: always_inline function '_mm_xor_si128' requires target
      feature 'sse2', but would be inlined into function 'main' that is compiled
      without support for 'sse2'
    __m128i zero = _mm_xor_si128(arb, arb);
                   ^
example.cpp:7:12: error: '__builtin_ia32_vec_ext_v2di' needs target feature sse2
    return _mm_extract_epi64(zero, 0);
           ^
/opt/rocm/hcc/lib/clang/8.0.0/include/smmintrin.h:1097:14: note: expanded from
      macro '_mm_extract_epi64'
  (long long)__builtin_ia32_vec_ext_v2di((__v2di)(__m128i)(X), (int)(N))

I'm able to get the code to compile by wrapping it in #ifdef __HCC_CPU__, but I doubt that is the intended approach. My hcc version is as follows:

hcc --version
HCC clang version 8.0.0 (ssh://gerritgit/compute/ec/hcc-tot/clang 6ec3c61e09fbb60373eaf5a40021eb862363ba2c) (ssh://gerritgit/lightning/ec/llvm ab3b88ffc2ae50f55361a49aec89f6e95d9d0ec4) (based on HCC 1.3.18482-757fb49-6ec3c61-ab3b88f )
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/rocm/bin

dragontamer avatar Dec 31 '18 07:12 dragontamer

This is actually expected behaviour. Those builtins are only available for x86 (Intel) targets. HCC is a single-source CPU/GPU compiler, so it makes two passes over the source: one for the CPU (host) target and one for the GPU (device) target. On the second pass the target is amdgcn, where these builtins are not supported.

david-salinas avatar Feb 01 '19 21:02 david-salinas

> This is actually expected behaviour. Those builtins are only available for intel targets. HCC is a single-source CPU/GPU compiler. So there are two passes; one for CPU (host) and one for GPU (device) targets. On the second pass, the target will be amdgcn, where these builtins are not supported.

Thanks for getting back to me on this.

What I expected instead was that only functions labeled [[hc]] would be compiled in the second pass. I would expect only a minority of the code to be for GPUs (the part that is called from [[hc]] functions).

When I use HCC intrinsics or inline assembly in a [[hc]] function, there's no compiler error, so I was hoping that x86 intrinsics could be used in a similar manner.

dragontamer avatar Feb 01 '19 22:02 dragontamer

I am also getting this error when compiling SSE2 intrinsics. What is the solution when using hcc?

rrawther avatar Feb 18 '20 22:02 rrawther
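
Nobody in the thread posted a solution, but the two-pass explanation and the #ifdef __HCC_CPU__ observation in the original report suggest one possible workaround: keep the x86 intrinsics behind a host-only guard and give the other pass a plain scalar body to parse. The following is a minimal sketch, not a confirmed fix. It assumes that hcc defines __HCC_CPU__ only on the host pass (as the report implies), that __HCC__ is defined on every hcc pass, and that ordinary x86 compilers define __SSE2__ when SSE2 is enabled. The names USE_SSE2_HOST_PATH and xor_self_low32 are hypothetical.

```cpp
// Sketch only. Assumptions: hcc defines __HCC_CPU__ on the host pass (as the
// workaround in this thread suggests) and __HCC__ on every hcc pass; ordinary
// x86 compilers define __SSE2__ when SSE2 is enabled.
#if defined(__HCC_CPU__) || (!defined(__HCC__) && defined(__SSE2__))
#define USE_SSE2_HOST_PATH 1
#include <emmintrin.h>  // SSE2 intrinsics
#else
#define USE_SSE2_HOST_PATH 0
#endif

// Hypothetical helper: XORs a value with itself and returns the low 32 bits.
int xor_self_low32(int seed) {
#if USE_SSE2_HOST_PATH
    __m128i v = _mm_set1_epi32(seed);  // broadcast seed into all four lanes
    __m128i z = _mm_xor_si128(v, v);   // x ^ x is 0 in every lane
    return _mm_cvtsi128_si32(z);       // read back the low 32-bit lane
#else
    return seed ^ seed;                // scalar body for passes without SSE2
#endif
}
```

Both branches return the same value, so callers do not need to know which one was compiled. The scalar branch exists mainly so that a pass targeting amdgcn never sees the always_inline SSE2 functions named in the error output.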