
ROCm 2.0: `Call to hsa_executable_load_code_object returned HSA_STATUS_ERROR_INVALID_CODE_OBJECT`

Open stuartarchibald opened this issue 6 years ago • 2 comments

I'm afraid this is going to be a bit vague. Essentially, I had the comgr tooling working as a compilation pipeline for Numba. I was forced to upgrade to ROCm 2.0 after updates to the base CentOS 7 OS left the ROCm 1.9.x release bizarrely broken (no call to HSA* worked at all, and the /opt/rocm/bin tools showed strange things). Having now fixed up the Numba code against the comgr that shipped with ROCm 2.0, compilation with comgr succeeds but then fails with:

 Call to hsa_executable_load_code_object returned HSA_STATUS_ERROR_INVALID_CODE_OBJECT

when a load of the executable object is attempted.

Further, the command line tools and libraries for doing compilation shipped in Numba's rocmtools package continue to work (despite being 1.9.x toolchain based). This suggests the drivers/runtime etc are still working as expected and it is indeed a problem with the ELF?

Any suggestions as to likely causes of the problem would be welcomed. Thanks.

stuartarchibald • Dec 21 '18 14:12

This is a known bug in the loader which we are working on. You can work around it by passing "-mno-code-object-v3" in the option string during compilation. We hope to have a fix by the end of next week.
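For a Python caller such as Numba, the workaround might be wired in roughly as below. This is only a sketch: `add_v3_workaround` is a hypothetical helper, the other options in the usage line are purely illustrative, and how the resulting string reaches comgr depends on the bindings in use.

```python
# Hypothetical helper: make sure the workaround flag is present in a
# space-separated compiler option string before it is handed to comgr.
WORKAROUND_FLAG = "-mno-code-object-v3"


def add_v3_workaround(options: str) -> str:
    """Return `options` with the workaround flag appended if it is missing."""
    parts = options.split()
    if WORKAROUND_FLAG not in parts:
        parts.append(WORKAROUND_FLAG)
    return " ".join(parts)


if __name__ == "__main__":
    # The options shown here are placeholders, not a recommended set.
    print(add_v3_workaround("-O3 -mcpu=gfx900"))
```

Appending only when the flag is absent keeps the helper safe to call more than once on the same option string.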

scott-linder • Dec 21 '18 16:12

Thanks for this. I've added that option in, and it seems like it makes compilation succeed, but the relocatables generated don't appear to contain the metadata expected/declared by the spec.

  • For a -v3 style object I can programmatically obtain that the root is a map and has size 1. The Version node is the only thing in the map and can be parsed as two string nodes 1 and 0.

  • For a +v3 style object I can programmatically obtain that the root is a map and has size 2; it would appear both amdhsa.version and amdhsa.kernels are in there. However, due to the above bug I cannot really test this in practice (at present, I've just compiled a +v3 DSO and shoehorned the bytes in at the appropriate location).
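When comparing the two kinds of object, a quick look at the ELF header can help confirm which one is actually in hand before digging into the metadata. The following is a minimal sketch, not part of comgr or Numba: `describe_code_object` is a hypothetical name, and the constants are the `e_machine` and `EI_OSABI` values I understand the LLVM AMDGPU documentation to use. How `EI_ABIVERSION` maps to a code object version may differ between toolchain releases, so the sketch only reports the raw byte.

```python
import struct

# Assumed constants, taken from the LLVM AMDGPU backend documentation.
EM_AMDGPU = 224           # e_machine for AMDGPU
ELFOSABI_AMDGPU_HSA = 64  # EI_OSABI for HSA code objects


def describe_code_object(blob: bytes) -> dict:
    """Report a few ELF header fields of a code object held in memory."""
    if len(blob) < 20 or blob[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    ei_data = blob[5]                    # 1 = little endian, 2 = big endian
    endian = "<" if ei_data == 1 else ">"
    # e_type and e_machine follow the 16-byte e_ident array.
    e_type, e_machine = struct.unpack_from(endian + "HH", blob, 16)
    return {
        "is_64bit": blob[4] == 2,        # EI_CLASS
        "osabi_is_hsa": blob[7] == ELFOSABI_AMDGPU_HSA,
        "abi_version": blob[8],          # raw EI_ABIVERSION byte
        "e_type": e_type,                # 1 = relocatable, 3 = shared object
        "machine_is_amdgpu": e_machine == EM_AMDGPU,
    }
```

Running this on the bytes produced with and without the flag should at least show whether the two images differ at the header level.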

stuartarchibald • Dec 31 '18 17:12

This should be fixed in recent releases. If anyone is still hitting this issue, please comment and I can reopen!

lamb-j • Mar 31 '23 17:03