[CUDA] update default CUDA sm ver to 75
This pull request updates the default NVIDIA GPU architecture used for CUDA from SM_50 to SM_75, because CUDA 13, the newest release, no longer supports SM_50.
It also updates several tests:
- sets CUDA 10.0 as the default toolkit version used in driver-detection tests
- bumps the SM version to 75 and the PTX version to 63 for tests that look for specific output patterns
- adds support for the 3-operand atomic intrinsic
- adds support for the tanh.approx.f16/f16x2 intrinsic
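One practical consequence of the new default is that devices older than SM_75 are no longer covered unless an architecture is requested explicitly. The sketch below is only an illustration of that check, not code from this PR: the helper names `parse_sm` and `needs_explicit_arch` are hypothetical, and comparing the numeric part of the `sm_NN` name is a simplifying assumption.

```python
import re

NEW_DEFAULT = "sm_75"  # default after this PR (taken from the description above)
OLD_DEFAULT = "sm_50"  # default before this PR


def parse_sm(arch: str) -> int:
    """Return the numeric part of an 'sm_NN' name, e.g. 'sm_75' -> 75."""
    # Hypothetical helper; accepts an optional one-letter suffix such as 'sm_90a'.
    match = re.fullmatch(r"sm_(\d+)[a-z]?", arch.strip().lower())
    if match is None:
        raise ValueError(f"not an sm_NN architecture name: {arch!r}")
    return int(match.group(1))


def needs_explicit_arch(device_arch: str, default_arch: str = NEW_DEFAULT) -> bool:
    """True if the device is older than the default architecture.

    Assumption: a plain numeric comparison of the sm numbers is good enough
    for this illustration.
    """
    return parse_sm(device_arch) < parse_sm(default_arch)


if __name__ == "__main__":
    for arch in ("sm_50", "sm_61", "sm_75", "sm_86"):
        status = "pass an explicit arch" if needs_explicit_arch(arch) else "default is fine"
        print(f"{arch}: {status}")
```

For comparison, `needs_explicit_arch("sm_61", OLD_DEFAULT)` returns `False`, which is the behavior users of such devices would have seen before this change.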
@bratpiorka, please remove any changes from mlir/ project files. We don't support this project, and these modifications affect pulldowns from upstream.
@bader thanks, removed
@aelovikov-intel @steffenlarsen @sarnex @mdtoguchi CI passes and the PR is ready for review
I see only 2 files owned by dpcpp-tools (not sure why though...):
- llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
- llvm/lib/Target/NVPTX/NVPTXIntrinsics.td

Questions: What is the plan for merging llvm/llvm-project#170679? Also, why were the NVPTX files above not included in the upstream patch? Is it possible to align these files between llvm/llvm-project and intel/llvm?
@YuriPlyakhin When I created an upstream version of this PR, I noticed there are many differences between our NVPTX*.td files and the upstream versions. My plan is to merge selected changes from upstream, but I’d like to do this in separate PRs, as it will likely require modifying more than just these two files in the SYCL repository. After that, I plan to update llvm/llvm-project#170679 with the final set of changes. Does this plan work for you?
Yes, that works. I've approved the dpcpp-tools-owned files.
@intel/llvm-gatekeepers please consider merging
Hi @bratpiorka,
FYI, FreeFunctionKernels/structs_with_special_types_as_kernel_paramters.cpp started failing during PTX CodeGen after this PR, see https://github.com/intel/llvm/actions/runs/20262778001/job/58178580688
# | 1. Code generation
# | 2. Running pass 'Function Pass Manager' on module '/tmp/lit-tmp-8tjbtyf5/structs_with_special_types_as_kernel_paramters-sm_75-504b8a_0.bc'.
# | 3. Running pass 'Live Variable Analysis' on function '@_ZN23__sycl_kernel_local_acc17nsNdRangeFreeFuncENS_23StructWithLocalAccessorE'
# | Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
# | 0 clang-22 0x00005614714b60d2 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) + 66
# | 1 clang-22 0x00005614714b3092
# | 2 libc.so.6 0x00007fa412a92330
# | 3 clang-22 0x00005614707b94bb llvm::LiveVariables::HandleVirtRegUse(llvm::Register, llvm::MachineBasicBlock*, llvm::MachineInstr&) + 139
# | 4 clang-22 0x00005614707b9af9 llvm::LiveVariables::runOnInstr(llvm::MachineInstr&, llvm::SmallVectorImpl<llvm::Register>&, unsigned int) + 457
# | 5 clang-22 0x00005614707ba7f1 llvm::LiveVariables::runOnBlock(llvm::MachineBasicBlock*, unsigned int) + 1489
# | 6 clang-22 0x00005614707bb65f llvm::LiveVariables::analyze(llvm::MachineFunction&) + 927
# | 7 clang-22 0x00005614707bbd91
# | 8 clang-22 0x0000561470841770 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) + 400
# | 9 clang-22 0x0000561470ddc4f9 llvm::FPPassManager::runOnFunction(llvm::Function&) + 1593
# | 10 clang-22 0x0000561470ddc6a4 llvm::FPPassManager::runOnModule(llvm::Module&) + 52
# | 11 clang-22 0x0000561470ddba27 llvm::legacy::PassManagerImpl::run(llvm::Module&) + 1047
# | 12 clang-22 0x0000561471743a36 clang::emitBackendOutput(clang::CompilerInstance&, clang::CodeGenOptions&, llvm::StringRef, llvm::Module*, clang::BackendAction, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>, clang::BackendConsumer*) + 4886
# | 13 clang-22 0x0000561471d37a35 clang::CodeGenAction::ExecuteAction() + 837
@uditagarwal97 If this issue only affects a single test, would it be acceptable for me to disable it, submit an issue, and fix it in the next pull request?
I agree. Please submit a GH issue; I already have a PR to XFAIL the test: https://github.com/intel/llvm/pull/20907