Rafal Bielski

Results 12 issues of Rafal Bielski

I would love to be able to select:

* language: C++
* compiler: icx (recent versions)
* compiler flags: `-fsycl -fsycl-targets=nvptx64-nvidia-cuda`

and successfully compile and browse PTX device code. It...

### Describe the bug

The initial value for `sycl::minimum` is set to `inf`, which gets turned into `0` with `-ffast-math`. More generally, both min and max are affected for any...

bug
confirmed
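To illustrate the `sycl::minimum` report above: the sketch below is plain C++ (no SYCL) and shows a min-reduction seeded with the largest finite value instead of infinity, so the seed stays representable when the compiler assumes finite-only math. The helper name `minReduceFinite` is hypothetical, and using `std::numeric_limits<T>::max()` as the seed is an assumption made for illustration, not the fix adopted in the issue.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

// Hypothetical helper (not from the issue): min-reduction whose seed is the
// largest finite value of T rather than +inf. Assumption: every input is
// finite, so max() behaves as an identity element for std::min.
template <typename T>
T minReduceFinite(const std::vector<T> &values) {
  T acc = std::numeric_limits<T>::max(); // finite stand-in for +inf
  for (const T &v : values) {
    acc = std::min(acc, v);
  }
  return acc;
}
```

An empty input returns the seed itself, which callers would need to treat as "no data".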

## Environment Information

- UMF version (hash commit or a tag): main
- OS(es) version(s): latest Alpine Linux
- kernel version(s): 6.5.0-15-generic
- compiler, libraries, and other related tools version(s):...

bug

Use the wrapper function from `infrastructure/SYCL.h` (introduced in #95) to call either `host_task` or native command submission extensions when available. All changes are exclusively in the code paths taken by...

**TLDR:** OpenCL adapter implementation of `urEnqueueUSMFill` calls `clEnqueueMemFillINTEL` for power-of-2 pattern size without checking destination memory alignment required by `clEnqueueMemFillINTEL`.

**Full Story:** I ran into this issue when trying to...

bug
opencl
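The kind of check the `urEnqueueUSMFill` report above finds missing can be sketched as follows. This is a minimal illustration, not the adapter's code: the helper names `isPowerOfTwo` and `canUseNativeFill` are hypothetical, and the sketch assumes the alignment requirement is that the destination address be a multiple of the pattern size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True when n is a non-zero power of two.
inline bool isPowerOfTwo(std::size_t n) {
  return n != 0 && (n & (n - 1)) == 0;
}

// Hypothetical helper (not part of the adapter): decides whether a native
// fill call may be used. Assumption: the destination must be aligned to the
// pattern size, and the pattern size must be a power of two.
inline bool canUseNativeFill(const void *dst, std::size_t patternSize) {
  if (!isPowerOfTwo(patternSize)) {
    return false;
  }
  const auto addr = reinterpret_cast<std::uintptr_t>(dst);
  // For a power-of-two size, the low bits of an aligned address are zero.
  return (addr & (patternSize - 1)) == 0;
}
```

When the check fails, a caller would have to take some other fill path; which one is outside what the snippet above states.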

Coming from #1299 which originally included a change of the error code, but upon further discussion with @GeorgeWeb we agreed the error handling improvement should be a separate PR, paired...

cuda

Building on top of https://github.com/intel/llvm/pull/12604 + https://github.com/oneapi-src/unified-runtime/pull/1318 which adds `handleOutOfResources` to dpcpp and returns `UR_RESULT_ERROR_OUT_OF_RESOURCES`, the local mem size check: https://github.com/oneapi-src/unified-runtime/blob/f086f369cab557bf2a589e22bfc37e18d7de5fa8/source/adapters/cuda/enqueue.cpp#L294-L298 should also return `UR_RESULT_ERROR_OUT_OF_RESOURCES` and have dedicated error handling...

cuda

Two changes improving the CI configuration running on Arm CPUs:

### 1. Use apt to install ArmPL in the CI

Install ArmPL from apt (as per [instructions](https://learn.arm.com/install-guides/armpl/)) instead of downloading...

### Describe the bug

The extensions documentation mandates: https://github.com/intel/llvm/blob/b91d3e2be018c4bf55a4612b074a1d6214828c8b/sycl/doc/extensions/README-process.md

> Each extension also has a feature-test macro, which is the same as the extension's name, except it uses all upper...

bug
spec extension
confirmed

Improve SYCL performance on CUDA and HIP backends with the two changes below. There is no functional change for Intel backends.

#### 1. Add CMake option to use in-order queue...