llvm
Intel staging area for llvm.org contribution. Home for Intel LLVM-based projects.
Reported by Coverity as CID `520793`. You can access the Coverity results here: https://scan.coverity.com/projects/intel-llvm?tab=overview The following piece of code: https://github.com/intel/llvm/blob/faa61805224a993f8f9de6214ba0678afc9e53e4/unified-runtime/source/adapters/level_zero/event.cpp#L1585-L1588 is located within a loop: https://github.com/intel/llvm/blob/faa61805224a993f8f9de6214ba0678afc9e53e4/unified-runtime/source/adapters/level_zero/event.cpp#L1552-L1553 This means that we may call...
Suggested by @aelovikov-intel in https://github.com/intel/llvm/pull/20707#pullrequestreview-3493514286: Have a local format.py in SYCLBIN such that ```c++ // REQUIRES-EXEC: ... // REQUIRES-OBJ: ... // REQUIRES-INPUT: ... // RUN-EXEC: // RUN-OBJ: // RUN-INPUT: //...
### Describe the bug ``` llvm/unittests/ExecutionEngine/Orc/ReOptimizeLayerTest.cpp failing // JIT session error: Symbols not found: [ __ImageBase ] // unknown file: error: SEH exception with code 0x3221225477 thrown in the test...
Which will let the driver create extra allocations in certain scenarios. Should we use `zeContextCreateEx` instead of `zeContextCreate`? https://github.com/intel/llvm/blob/ad880488f093333d93e4025573587f7d522e79d5/sycl/plugins/unified_runtime/ur/adapters/level_zero/context.cpp#L32
Using checkout `clang version 17.0.0 (https://github.com/intel/llvm.git 23a6f389c1e45df077c6f15b691835b2976fda4d)` This may be a problem with the L0 adapter or with the L0 driver that I am using. Any advice would be appreciated....
Reported as CID `535426`. You can access the Coverity scan results here: https://scan.coverity.com/projects/intel-llvm?tab=overview https://github.com/intel/llvm/blob/b1528119119af75cc5403d082f43aa6cd7f47871/sycl/source/detail/adapter_impl.hpp#L295-L300 All destructors are implicitly `noexcept`, and `Adapter->call` may throw. I understand that there is not much...
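To illustrate the concern in the report above, here is a minimal sketch of one common way to keep an exception from escaping an implicitly `noexcept` destructor. It is not the code from `adapter_impl.hpp`: `release_resource`, `GuardedHandle`, `last_destructor_error`, and `destroy_with_failure` are hypothetical names invented for this example, with `release_resource` standing in for a call such as `Adapter->call` that may throw.

```cpp
#include <exception>
#include <stdexcept>
#include <string>
#include <type_traits>

// Hypothetical stand-in for a call such as `Adapter->call`, which may throw.
inline void release_resource(bool fail) {
  if (fail)
    throw std::runtime_error("release failed");
}

// Records the last error swallowed by a destructor (illustrative only).
inline std::string &last_destructor_error() {
  static std::string message;
  return message;
}

class GuardedHandle {
public:
  explicit GuardedHandle(bool fail_on_release) : fail_(fail_on_release) {}

  // Destructors are implicitly noexcept: an exception escaping from here
  // would call std::terminate, so it is caught and recorded instead.
  ~GuardedHandle() {
    try {
      release_resource(fail_);
    } catch (const std::exception &e) {
      last_destructor_error() = e.what();
    } catch (...) {
      last_destructor_error() = "unknown error";
    }
  }

private:
  bool fail_;
};

// The destructor keeps its implicit noexcept specification.
static_assert(std::is_nothrow_destructible<GuardedHandle>::value,
              "GuardedHandle's destructor must not throw");

// Returns true if a failing release was contained by the destructor.
inline bool destroy_with_failure() {
  last_destructor_error().clear();
  { GuardedHandle handle(true); } // destructor runs at the end of this scope
  return last_destructor_error() == "release failed";
}
```

Calling `destroy_with_failure()` destroys a handle whose release throws and reports whether the error was contained rather than terminating the program. Whether swallowing, logging, or otherwise handling the error is appropriate for the real adapter code is a design decision this sketch does not settle.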
Level Zero `SubmitGraph` benchmarks support `EmulateGraphs=0` (L0 record-and-replay APIs) and `EmulateGraphs=1` (submitting a command list to an immediate command list). This PR adds the record-and-replay compute benchmark.
This PR removes the `std::runtime_error` requiring that the dimensionality of the cluster, global, and local ranges be the same. I don't think that this should be a restriction. I can't find anything in https://docs.nvidia.com/cuda/cuda-c-programming-guide/#thread-block-clusters Here...
https://github.com/intel/llvm/blob/8ab28e0d43a0d9bdff7210b39d04b891676009ee/sycl/test/check_device_code/extensions/properties/properties_cache_control.cpp#L61-L71 This code is supposed to check cache hints for a load operation, but the test only performs a single store. The same issue exists in the other "read" checks. All checks in this...
### Describe the bug Currently the DPC++ driver uses `llc` to lower the IR module produced by `clang-offload-wrapper` to an object file. When doing so it ignores target features (none...