Jakub Chlanda
@pasaulais I don't think so, doesn't `BuildMI` take care of that? In any case, this was only an interface change: `expandPostRAPseudo` used to take `MI` as an iterator, so functionality...
@skambapugithub I've looked into the sample you've provided. It seems to me that clang correctly generates floating-point fused multiply-add instructions, so I would not worry about those flags....
@skambapugithub I've looked a bit further into the vector loads/stores, and I think clang is right not to generate them. From the [PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#vectors):

> By default, vector variables are aligned to a...
I've looked into similar applications and I think there is an inherent problem with the way `marray` is implemented; unlike built-in vector types, the alignment of `marray`...
Hi @anton-v-gorshkov, I've looked into the issue a bit. The problem is not that the compiler fails to unroll the loop; it is being unrolled correctly, see a debug snippet...
I've just checked with the top of llvm/sycl [f963](https://github.com/intel/llvm/commit/f963062cf9b23aadbd2d4a976b20f51ae7c51d23) and top of CTS [da7b](https://github.com/KhronosGroup/SYCL-CTS/commit/da7b9047fb47fe6e55e92ba92efddb4a272c99dd) and the swizzle tests build just fine.
Huh, so `test_optional_kernel_features` fails with the same missing symbol, I'll re-open the ticket.

```
ptxas fatal : Unresolved extern function '_Z18__spirv_AtomicIAddPyN5__spv5Scope4FlagENS0_19MemorySemanticsMask4FlagEy'
```
This will be fixed when https://github.com/intel/llvm/pull/7220 gets merged.
https://github.com/intel/llvm/pull/7220 was fixed in https://github.com/intel/llvm/pull/7723
Hi @skambapugithub, I've revisited this issue with a [recent build](https://github.com/intel/llvm/commit/3df87e20569ea63d0a74de525b6d19788dd8afca) and was able to run both CUDA and SYCL versions on a `Tesla V100-SXM2-16GB` node. I do not see the...