"-mno-xnack" flag is removed by upstream clang
Building and testing MIOpen with upstream clang, I got some failures. One is test_conv2d:
The output of AMD_COMGR_SAVE_TEMPS=1 AMD_COMGR_REDIRECT_LOGS=stdout AMD_COMGR_EMIT_VERBOSE_LOGS=1 /run/user/18014/portage/sci-libs/miopen-5.7.1-r1/work/MIOpen-rocm-5.7.1_build/bin/test_conv2d --float --cmode conv --pmode valid --group-count 1 --batch_size 8 --input_channels 32 --output_channels 32 --spatial_dim_elements 28 28 --filter_dims 5 5 --pads_strides_dilations 1 1 1 1 1 1 --trans_output_pads 0 0 --in_layout NCHW --fil_layout NCHW --out_layout NCHW --deterministic 0 --tensor_vect 0 --vector_length 1 --output_type int32 --int8_vectorize 0 is:
Original post is in https://github.com/ROCm/MIOpen/issues/990#issuecomment-2012468648 which points out that -mno-xnack is not supported by upstream clang. Maybe we should remove the "-target-feature", "-xnack" from optAsm rather than adding a -mno-xnack
@littlewu2508 Yep, it seems like after some recent changes in the compiler, -m[no-]<target-feature> can't be used with xnack, and XNACK should be specified in -mcpu. Ref: https://llvm.org/docs/AMDGPUUsage.html#id117. I am wondering why -m[no-]xnack normally works with latest ROCm.
I'll look into this. But please let us know hot urgent/important is this issue for you.
/cc @junliume @JehandadKhan
ℹ️ Copy of the discussion thread from #990
@littlewu2508:
The mno-xnack is removed in upstream clang llvm/llvm-project@5cae708 in 2020. However it seems MIOpen is still using that:
https://github.com/ROCm/MIOpen/blob/f34a90f7699e4af0e021ff0bf4111b1c407bdec1/src/comgr.cpp#L1013
And also, why MIOpen has its own comgr, rather than using https://github.com/ROCm/ROCm-CompilerSupport?
@atamazov:
The mno-xnack is removed in upstream clang
It was removed from HIP compiler, but we use it for AMDGPU assembler.
And also, why MIOpen has its own comgr, rather than using https://github.com/ROCm/ROCm-CompilerSupport?
MIOpen uses COMgr and HIPRTC libs to build kernels, no "its own" comgr.
@littlewu2508:
The mno-xnack is removed in upstream clang
It was removed from HIP compiler, but we use it for AMDGPU assembler.
Understood. So I think maybe we can remove the "-target-feature", "-xnack" from optAsm.
@littlewu2508 I am going to provide a fix for this issue soon. Please let me know if offline assembly (i.e. when the library uses clang executable directly) is working fine on your system.
How to switch from online to offline kernel compilation: just pass two additional options (-DMIOPEN_USE_COMGR=Off -DMIOPEN_USE_HIPRTC=Off) to CMake, then rebuild/reinstall MIOpen. Thanks!
@littlewu2508 #2891 created. Here is the diff if you would like to try it ASAP: https://github.com/ROCm/MIOpen/compare/develop...atamazov:gcnasm-noxnack-etc.diff
@littlewu2508 #2891 created. Here is the diff if you would like to try it ASAP: https://github.com/ROCm/MIOpen/compare/develop...atamazov:gcnasm-noxnack-etc.diff
Thanks! I backported it to 5.7 and it resolves the issue of unknown -mno-xnack.
Uploading miopen-5.7.1-remove-mno-xnack.patch.gz…
@littlewu2508 in order to close this we also need #2891 to be merged in, so I suggest reopening this.
in order to close this we also need https://github.com/ROCm/MIOpen/pull/2891 to be merged in, so I suggest reopening this.
You're right, this is not merged in develop branch yet
@junliume Please reopen this due to merged #3002.
@littlewu2508 To get the fix back, you can change value of WORKAROUND_ISSUE_3001 (see #3002) to 0 and rebuild the library.
Closing since MIOpen is being migrated to ROCm/rocm-libraries. Please re-open there if you believe this is still an issue.