Andrey Alekseenko
**Describe the bug** With ROCm 4.5.2, trying to call `device.get_info()` on an AMD device throws `cl::sycl::runtime_error`. **To Reproduce** ```cpp #include #include int main() { std::vector devices = sycl::device::get_devices(); for (const...
**Describe the bug** After program completion, when all the resources are getting deinitialized, it aborts with "corrupted double-linked list". The error is semi-random. Seems much more likely to be triggered...
**Describe the bug** When compiling a file for the HIP backend (`-fsycl-targets=amdgcn-amd-amdhsa`) it is necessary to specify `--offload-target`. However, when trying to compile for the CUDA backend in the same...
**Describe the bug** Templated code that sometimes passes a local accessor to the kernel and sometimes passes a `nullptr` fails with a runtime error during kernel setup when running on...
**Describe the bug** I want to submit many tasks to the device queue asynchronously, minimizing the CPU time: - Do some operations on bufferB in queue B, synchronize with it....
**Bug summary** In some group functions for the CUDA backend, hipSYCL uses the `__activemask` intrinsic to detect currently active threads, e.g. here: https://github.com/illuhad/hipSYCL/blob/develop/include/hipSYCL/sycl/libkernel/cuda/group_functions.hpp#L99 However, per the section "Active Mask Query" in https://developer.nvidia.com/blog/using-cuda-warp-level-primitives/,...
If `__HIPSYCL_FORCE_INLINE_ALL__` is defined, the hipSYCL plugin will attach the `always_inline` attribute to all functions called from kernels. For some reason, adding the `flatten` attribute to kernels does not help. For...
**Bug summary** The presence of the `nd_item::_offset` field, in some cases, noticeably increases register usage by the kernel. Usually, the compiler does a good enough job optimizing things, but not always....
Currently, the documentation says: > Note that all current Nvidia devices return 32 for this variable, and all current AMD devices return 64. > (https://github.com/RadeonOpenCompute/ROCm_Documentation/blob/d54ddbd43dcc434211c55451445093e4c6a5bb07/Programming_Guides/Kernel_language.rst#warpsize) However, this is not the...