
Results 324 unified-runtime issues

Note that this change includes a specification change: urProgramLink now requires the output parameter to contain either nullptr or some unspecified binary on failure. As well as this change, a...

loader
conformance
specification
experimental
level-zero
cuda
hip
opencl
native-cpu

This patch: - refactors options handling. - for use-after-free, does not try to get allocated/released info when quarantine is not enabled (no such info exists anyway). - for findAllocInfoByAddress(), adds an assertion...

loader
sanitizer

The L0 GPU runtime divides the device memory address space equally among all GPU devices. So, if there are multiple GPU devices, the device sanitizer may not be able to...

loader
sanitizer

This PR tries to implement the API `urKernelGetSuggestedLocalWorkSize`, discussed in https://github.com/oneapi-src/unified-runtime/issues/1270. SYCLOS PR: https://github.com/intel/llvm/pull/12902. Also fixes: - For Level-Zero: when `LocalWorkSize` is provided, `urEnqueueKernelLaunch()` reads `LocalWorkSize` without respecting `workDim`.

loader
conformance
specification
level-zero
cuda
hip
opencl
ready to merge
native-cpu
sanitizer
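The Level-Zero fix mentioned in the entry above concerns reading a caller-supplied `LocalWorkSize` array without regard to `workDim`. A minimal sketch of the general idea follows, assuming a helper that copies only the first `workDim` entries and pads the remaining dimensions with 1. The name `normalizeLocalWorkSize` is hypothetical and is not taken from Unified Runtime or from the PR; the adapter's actual fix may look different.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of Unified Runtime): reads only the first
// `workDim` entries of a caller-supplied local work size and fills the
// unused dimensions with 1, so a 1D or 2D launch never touches entries
// the caller did not provide.
std::array<std::size_t, 3> normalizeLocalWorkSize(std::uint32_t workDim,
                                                  const std::size_t *localWorkSize) {
    assert(workDim >= 1 && workDim <= 3); // assumed valid range for this sketch

    std::array<std::size_t, 3> result{1, 1, 1};
    if (localWorkSize == nullptr) {
        return result; // no size provided; leave every dimension at 1
    }

    // Clamp defensively in case assertions are compiled out.
    const std::uint32_t dims = workDim < 3 ? workDim : 3;
    for (std::uint32_t i = 0; i < dims; ++i) {
        result[i] = localWorkSize[i];
    }
    return result;
}
```

With this shape, a 2D launch that passes a two-element array yields `{x, y, 1}` rather than reading a third element that does not exist.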

CI in LLVM/SYCL: https://github.com/intel/llvm/pull/14536

level-zero

Since EvStart and EvEnd are recorded directly after one another in `urEnqueueTimestampRecordingExp`, we can just copy EvStart to make EvEnd, instead of calling cuEventRecord for both `EvStart` and `EvEnd`, one...

cuda
hip

I recently created two PRs (#1508 and #1509) which were simple changes to [.github/labeler.yml](https://github.com/oneapi-src/unified-runtime/blob/main/.github/labeler.yml) which triggered the full CI pipeline. These have no effect on UR spec/source code at all...

ci/cd

Is there a SYCL function for cudaOccupancyMaxActiveBlocksPerMultiprocessor? Some use cases are listed below. Thanks.
AITemplate/3rdparty/cutlass/include/cutlass/gemm/device/gemm_universal_adapter.h: result = cudaOccupancyMaxActiveBlocksPerMultiprocessor(
AITemplate/3rdparty/cutlass/include/cutlass/gemm/device/gemm_universal_base.h: cudart_result = cudaOccupancyMaxActiveBlocksPerMultiprocessorWithFlags(
AITemplate/3rdparty/cutlass/include/cutlass/gemm/device/gemm_universal_base.h: CUTLASS_TRACE_HOST(" cudaOccupancyMaxActiveBlocksPerMultiprocessorWithFlags() returned error "

cuda

Also attempts to clarify the wording around this a bit. Addresses #558. LLVM testing: https://github.com/intel/llvm/pull/12270

loader
conformance
specification
level-zero
cuda
hip
opencl
native-cpu
sanitizer