
[Bug]: rocblas cannot load TensileLibrary.dat

Open G-Ragghianti opened this issue 2 years ago • 12 comments

Describe the bug

Code using rocblas is unable to load TensileLibrary.dat with error:

rocBLAS error: Cannot read /opt/rocm/rocblas/library/TensileLibrary.dat: Illegal seek

To Reproduce

This occurs when using the RPM packages for rocm 5.2 and newer. Version 5.1 includes this file, but it appears to have been removed in later versions.

Expected behavior

The RPM packages should include TensileLibrary.dat, or provide some way of creating this file automatically.

Environment

Hardware description
CPU AMD EPYC 7413
GPU MI-210
Software version
rocm-core v5.2
rocblas v5.2

Attached: environment.txt

G-Ragghianti avatar Sep 21 '22 19:09 G-Ragghianti

@G-Ragghianti, in ROCm 5.2 the TensileLibrary.dat file was split into multiple files, one per GPU architecture. So you should find files such as TensileLibrary_gfx906.dat, TensileLibrary_gfx90a.dat, etc. in the /opt/rocm/rocblas/library/ path.

To debug the issue further, would it be possible for you to run the following command and provide the output? ldd <user_app>

I am curious to see which version of rocblas the application uses.

rkamd avatar Sep 22 '22 19:09 rkamd

Yes, I noticed that TensileLibrary.dat hasn't been provided since version 5.2, but the code in rocblas still seems to reference it as a fallback if it fails to read the arch-specific .dat file. I'm pretty sure that we are building and running only on version 5.2, but I've asked my collaborator to post his ldd output. Thanks.
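
The fallback described here would explain why the error message names TensileLibrary.dat even though only per-architecture files are installed. A minimal Python sketch of that lookup order follows. It is an illustration only, not the rocBLAS source: the function name resolve_tensile_dat and the exact order of candidates are assumptions.

```python
from pathlib import Path
from typing import Optional


def resolve_tensile_dat(lib_dir: str, arch: str) -> Optional[Path]:
    """Return the first Tensile .dat file found for `arch`, or None.

    Hypothetical helper: it mimics "try the per-architecture file,
    then fall back to the merged file", as described in the thread.
    """
    base = Path(lib_dir)
    candidates = [
        base / f"TensileLibrary_{arch}.dat",  # per-architecture file (ROCm 5.2 packages)
        base / "TensileLibrary.dat",          # merged file (as shipped with ROCm 5.1)
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None  # neither file is present in lib_dir
```

Under this assumption, if the library looks in the wrong directory, both candidates are missing and the last name tried is the one that ends up in the error message.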

G-Ragghianti avatar Sep 27 '22 20:09 G-Ragghianti

$ ldd test/tester
        linux-vdso.so.1 =>  (0x00007ffc59365000)
        libslate.so => /home/kadir/dopamine/t020/slate-dev/lib/libslate.so (0x00007f4f3b9a2000)
        libtestsweeper.so => /home/kadir/dopamine/t020/slate-dev/testsweeper/libtestsweeper.so (0x00007f4f3b792000)
        libmkl_scalapack_lp64.so.2 => /nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-oneapi-mkl-2022.0.2-csli2oqmbu6esrsvxrnenbde52w7touz/mkl/2022.0.2/lib/intel64/libmkl_scalapack_lp64.so.2 (0x00007f4f3b065000)
        libmkl_blacs_intelmpi_lp64.so.2 => /nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-oneapi-mkl-2022.0.2-csli2oqmbu6esrsvxrnenbde52w7touz/mkl/2022.0.2/lib/intel64/libmkl_blacs_intelmpi_lp64.so.2 (0x00007f4f3ca4b000)
        libblaspp.so => /home/kadir/dopamine/t020/slate-dev/blaspp/lib/libblaspp.so (0x00007f4f3aded000)
        liblapackpp.so => /home/kadir/dopamine/t020/slate-dev/lapackpp/lib/liblapackpp.so (0x00007f4f3aa94000)
        libmkl_gf_lp64.so.2 => /nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-oneapi-mkl-2022.0.2-csli2oqmbu6esrsvxrnenbde52w7touz/mkl/2022.0.2/lib/intel64/libmkl_gf_lp64.so.2 (0x00007f4f39bf6000)
        libmkl_sequential.so.2 => /nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-oneapi-mkl-2022.0.2-csli2oqmbu6esrsvxrnenbde52w7touz/mkl/2022.0.2/lib/intel64/libmkl_sequential.so.2 (0x00007f4f381dc000)
        libmkl_core.so.2 => /nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-oneapi-mkl-2022.0.2-csli2oqmbu6esrsvxrnenbde52w7touz/mkl/2022.0.2/lib/intel64/libmkl_core.so.2 (0x00007f4f33e27000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f4f33c0b000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007f4f33a07000)
        librocsolver.so.0 => /opt/rocm/lib/librocsolver.so.0 (0x00007f4f0040b000)
        librocblas.so.0 => /opt/rocm/lib/librocblas.so.0 (0x00007f4ef2a07000)
        libamdhip64.so.5 => /opt/rocm/lib/libamdhip64.so.5 (0x00007f4ef1b34000)
        libmpicxx.so.12 => /nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-mpi-2019.8.254-mszg6y4geg4wip2awwug3f4v5ztshevf/compilers_and_libraries_2020.2.254/linux/mpi/intel64/lib/libmpicxx.so.12 (0x00007f4ef1914000)
        libmpifort.so.12 => /nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-mpi-2019.8.254-mszg6y4geg4wip2awwug3f4v5ztshevf/compilers_and_libraries_2020.2.254/linux/mpi/intel64/lib/libmpifort.so.12 (0x00007f4ef1555000)
        libmpi.so.12 => /nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-mpi-2019.8.254-mszg6y4geg4wip2awwug3f4v5ztshevf/compilers_and_libraries_2020.2.254/linux/mpi/intel64/lib/release/libmpi.so.12 (0x00007f4ef0339000)
        librt.so.1 => /lib64/librt.so.1 (0x00007f4ef0131000)
        libstdc++.so.6 => /nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/gcc-7.3.0-z3dl67mmil3qe2z5loqk6l5denetg5x7/lib64/libstdc++.so.6 (0x00007f4eefdaf000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f4eefaad000)
        libgomp.so.1 => /nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/gcc-7.3.0-z3dl67mmil3qe2z5loqk6l5denetg5x7/lib64/libgomp.so.1 (0x00007f4eef87f000)
        libgcc_s.so.1 => /nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/gcc-7.3.0-z3dl67mmil3qe2z5loqk6l5denetg5x7/lib64/libgcc_s.so.1 (0x00007f4eef668000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f4eef29a000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f4f3c876000)
        libamd_comgr.so.2 => /opt/rocm/lib/libamd_comgr.so.2 (0x00007f4ee7bc4000)
        libhsa-runtime64.so.1 => /opt/rocm/lib/libhsa-runtime64.so.1 (0x00007f4ee776f000)
        libnuma.so.1 => /lib64/libnuma.so.1 (0x00007f4ee7563000)
        libfabric.so.1 => /nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-mpi-2019.8.254-mszg6y4geg4wip2awwug3f4v5ztshevf/compilers_and_libraries_2020.2.254/linux/mpi/intel64/libfabric/lib/libfabric.so.1 (0x00007f4ee7321000)
        libz.so.1 => /lib64/libz.so.1 (0x00007f4ee710b000)
        libtinfo.so.5 => /lib64/libtinfo.so.5 (0x00007f4ee6ee1000)
        libelf.so.1 => /lib64/libelf.so.1 (0x00007f4ee6cc9000)
        libdrm.so.2 => /opt/amdgpu/lib64/libdrm.so.2 (0x00007f4f3ca14000)
        libdrm_amdgpu.so.1 => /opt/amdgpu/lib64/libdrm_amdgpu.so.1 (0x00007f4f3ca07000)

KadirAkbudak avatar Sep 28 '22 12:09 KadirAkbudak

@KadirAkbudak, thanks for the output. Could you please provide the output of the following commands:

ls -al /opt 
ls -al /opt/rocm/rocblas/library/
ls -al /etc/alternatives/rocm
rocm_agent_enumerator

@G-Ragghianti, by default rocBLAS generates architecture-specific TensileLibrary files, but users can override this with the merge-architectures build option to generate a single TensileLibrary.dat file.

Here, I am trying to understand why the library failed to load or find the TensileLibrary_gfx90a.dat file.
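
To make explicit which per-architecture files the library would be expected to find, here is a small Python sketch that maps rocm_agent_enumerator output to file names. The helper name expected_dat_files is hypothetical, and treating gfx000 as the CPU agent (and therefore skipping it) is an assumption about the enumerator's output.

```python
from typing import List


def expected_dat_files(enumerator_output: str) -> List[str]:
    """Map rocm_agent_enumerator output to expected Tensile .dat names."""
    archs: List[str] = []
    for line in enumerator_output.splitlines():
        name = line.strip()
        # Assumption: gfx000 denotes the CPU agent, which has no Tensile file.
        if not name or name == "gfx000":
            continue
        if name not in archs:  # several GPUs may share one architecture
            archs.append(name)
    return [f"TensileLibrary_{arch}.dat" for arch in archs]
```

For a node with two identical GPUs the list collapses to a single file name, which can then be checked against the contents of the library directory.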

rkamd avatar Sep 28 '22 15:09 rkamd

$ ls -al /opt
total 16
drwxr-xr-x.  7 root root  128 Jun 28 00:58 .
dr-xr-xr-x. 21 root root 4096 Sep 26 02:07 ..
drwxr-xr-x   7 root root   65 Jul 11 17:06 amdgpu
drwx--x--x   4 root root   28 Jun  8 18:55 containerd
-rwxr-xr-x   1 root root  366 Jan 26  2021 knl_mods.sh
drwxr-xr-x.  4 root root   50 Aug 31  2020 nvidia
drwxr-xr-x.  4 root root   46 May 27  2020 rh
lrwxrwxrwx   1 root root   22 Jul 11 17:04 rocm -> /etc/alternatives/rocm
drwxr-xr-x  36 root root 4096 Jun 28 01:14 rocm-5.2.0
-rwxr-xr-x.  1 root root  276 Aug 31  2020 uncore.sh

$ ls -al /opt/rocm/rocblas/library/
ls: cannot access /opt/rocm/rocblas/library/: No such file or directory

$ ls -al /opt/rocm/rocblas/lib
total 0
drwxr-xr-x 3 root root  40 Jul 11 17:06 .
drwxr-xr-x 4 root root  32 Jul 11 17:06 ..
drwxr-xr-x 2 root root 136 Jul 11 17:06 cmake
lrwxrwxrwx 1 root root  23 Jul 11 17:06 librocblas.so -> ../../lib/librocblas.so

$ ls -al /etc/alternatives/rocm
lrwxrwxrwx 1 root root 15 Jul 11 17:04 /etc/alternatives/rocm -> /opt/rocm-5.2.0

$ rocm_agent_enumerator
gfx000
gfx90a
gfx90a

$ ls -al /opt/rocm/lib/rocblas/library/
total 1358704
drwxr-xr-x 2 root root      4096 Jul 11 17:06 .
drwxr-xr-x 3 root root        21 Jul 11 17:06 ..
-rw-r--r-- 1 root root  22368232 Jun 28 00:47 Kernels.so-000-gfx1010.hsaco
-rw-r--r-- 1 root root  21409768 Jun 28 00:47 Kernels.so-000-gfx1012.hsaco
-rw-r--r-- 1 root root  20856808 Jun 28 00:47 Kernels.so-000-gfx1030.hsaco
-rw-r--r-- 1 root root  21716968 Jun 28 00:47 Kernels.so-000-gfx803.hsaco
-rw-r--r-- 1 root root  22536168 Jun 28 00:47 Kernels.so-000-gfx900.hsaco
-rw-r--r-- 1 root root  20475880 Jun 28 00:47 Kernels.so-000-gfx906-xnack-.hsaco
-rw-r--r-- 1 root root  20463592 Jun 28 00:47 Kernels.so-000-gfx908-xnack-.hsaco
-rw-r--r-- 1 root root  20193256 Jun 28 00:47 Kernels.so-000-gfx90a-xnack-.hsaco
-rw-r--r-- 1 root root  20197352 Jun 28 00:47 Kernels.so-000-gfx90a-xnack+.hsaco
-rw-r--r-- 1 root root 130175968 Jun 28 00:47 TensileLibrary_gfx1030.co
-rw-r--r-- 1 root root  30403016 Jun 28 00:47 TensileLibrary_gfx1030.dat
-rw-r--r-- 1 root root   4321536 Jun 28 00:47 TensileLibrary_gfx803.co
-rw-r--r-- 1 root root   5517022 Jun 28 00:47 TensileLibrary_gfx803.dat
-rw-r--r-- 1 root root  53663704 Jun 28 00:47 TensileLibrary_gfx900.co
-rw-r--r-- 1 root root  23703192 Jun 28 00:47 TensileLibrary_gfx900.dat
-rw-r--r-- 1 root root 113151720 Jun 28 00:47 TensileLibrary_gfx906.co
-rw-r--r-- 1 root root  53902785 Jun 28 00:47 TensileLibrary_gfx906.dat
-rw-r--r-- 1 root root 238989720 Jun 28 00:47 TensileLibrary_gfx908.co
-rw-r--r-- 1 root root  67871494 Jun 28 00:47 TensileLibrary_gfx908.dat
-rw-r--r-- 1 root root 346532104 Jun 28 00:47 TensileLibrary_gfx90a.co
-rw-r--r-- 1 root root 132832830 Jun 28 00:46 TensileLibrary_gfx90a.dat
-rw-r--r-- 1 root root      2796 Jun 28 00:21 TensileManifest.txt

$ ls -al /opt/rocm/lib
total 4890972
drwxr-xr-x  7 root root      8192 Jul 11 17:07 .
drwxr-xr-x 36 root root      4096 Jun 28 01:14 ..
drwxr-xr-x 25 root root      4096 Jul 11 17:07 cmake
drwxr-xr-x  3 root root        27 Jul 11 17:04 CMakeFiles
-rw-r--r--  1 root root        92 Jun 27 23:23 .hipInfo
lrwxrwxrwx  1 root root        17 Jul 11 17:05 libamd_comgr.so -> libamd_comgr.so.2
lrwxrwxrwx  1 root root        25 Jul 11 17:05 libamd_comgr.so.2 -> libamd_comgr.so.2.4.50200
-rwxr-xr-x  1 root root 124241088 Jun 27 23:20 libamd_comgr.so.2.4.50200
lrwxrwxrwx  1 root root        16 Jul 11 17:06 libamdhip64.so -> libamdhip64.so.5
lrwxrwxrwx  1 root root        24 Jul 11 17:06 libamdhip64.so.5 -> libamdhip64.so.5.2.50200
-rwxr-xr-x  1 root root  13438688 Jun 27 23:35 libamdhip64.so.5.2.50200
-rwxr-xr-x  1 root root   1415632 Jun 27 23:26 libamdocl64.so
lrwxrwxrwx  1 root root        15 Jul 11 17:06 libhipblas.so -> libhipblas.so.0
lrwxrwxrwx  1 root root        23 Jul 11 17:06 libhipblas.so.0 -> libhipblas.so.0.1.50200
-rwxr-xr-x  1 root root    489944 Jun 28 00:58 libhipblas.so.0.1.50200
-rwxr-xr-x  1 root root     99640 Jun 28 00:16 libhipfft.so
lrwxrwxrwx  1 root root        15 Jul 11 17:07 libhiprand.so -> libhiprand.so.1
lrwxrwxrwx  1 root root        23 Jul 11 17:05 libhiprand.so.1 -> libhiprand.so.1.1.50200
-rwxr-xr-x  1 root root     16600 Jun 28 00:07 libhiprand.so.1.1.50200
lrwxrwxrwx  1 root root        23 Jul 11 17:06 libhiprtc-builtins.so -> libhiprtc-builtins.so.5
lrwxrwxrwx  1 root root        31 Jul 11 17:06 libhiprtc-builtins.so.5 -> libhiprtc-builtins.so.5.2.50200
-rwxr-xr-x  1 root root    371296 Jun 27 23:34 libhiprtc-builtins.so.5.2.50200
lrwxrwxrwx  1 root root        14 Jul 11 17:06 libhiprtc.so -> libhiprtc.so.5
lrwxrwxrwx  1 root root        22 Jul 11 17:06 libhiprtc.so.5 -> libhiprtc.so.5.2.50200
-rwxr-xr-x  1 root root    585320 Jun 27 23:35 libhiprtc.so.5.2.50200
lrwxrwxrwx  1 root root        17 Jul 11 17:06 libhipsolver.so -> libhipsolver.so.0
lrwxrwxrwx  1 root root        25 Jul 11 17:06 libhipsolver.so.0 -> libhipsolver.so.0.1.50200
-rwxr-xr-x  1 root root    221992 Jun 28 00:57 libhipsolver.so.0.1.50200
lrwxrwxrwx  1 root root        17 Jul 11 17:06 libhipsparse.so -> libhipsparse.so.0
lrwxrwxrwx  1 root root        25 Jul 11 17:06 libhipsparse.so.0 -> libhipsparse.so.0.1.50200
-rwxr-xr-x  1 root root    251216 Jun 28 00:27 libhipsparse.so.0.1.50200
lrwxrwxrwx  1 root root        28 Jul 11 17:07 libhsa-amd-aqlprofile64.so -> libhsa-amd-aqlprofile64.so.1
lrwxrwxrwx  1 root root        36 Jul 11 17:07 libhsa-amd-aqlprofile64.so.1 -> libhsa-amd-aqlprofile64.so.1.0.50200
-rwxr-xr-x  1 root root    330008 Jun 27 23:19 libhsa-amd-aqlprofile64.so.1.0.50200
lrwxrwxrwx  1 root root        21 Jul 11 17:06 libhsa-runtime64.so -> libhsa-runtime64.so.1
lrwxrwxrwx  1 root root        29 Jul 11 17:06 libhsa-runtime64.so.1 -> libhsa-runtime64.so.1.5.50200
-rwxr-xr-x  1 root root   2486088 Jun 27 23:18 libhsa-runtime64.so.1.5.50200
lrwxrwxrwx  1 root root        14 Jul 11 17:07 libMIOpen.so -> libMIOpen.so.1
lrwxrwxrwx  1 root root        22 Jul 11 17:07 libMIOpen.so.1 -> libMIOpen.so.1.0.50200
-rwxr-xr-x  1 root root 369673232 Jun 28 00:59 libMIOpen.so.1.0.50200
lrwxrwxrwx  1 root root        11 Jul 11 17:05 liboam.so -> liboam.so.1
lrwxrwxrwx  1 root root        19 Jul 11 17:05 liboam.so.1 -> liboam.so.1.0.50200
-rwxr-xr-x  1 root root    633584 Jun 27 22:38 liboam.so.1.0.50200
lrwxrwxrwx  1 root root        14 Jul 11 17:07 libOpenCL.so -> libOpenCL.so.1
lrwxrwxrwx  1 root root        16 Jul 11 17:07 libOpenCL.so.1 -> libOpenCL.so.1.2
-rwxr-xr-x  1 root root     32784 Jun 27 23:26 libOpenCL.so.1.2
lrwxrwxrwx  1 root root        12 Jul 11 17:06 librccl.so -> librccl.so.1
lrwxrwxrwx  1 root root        20 Jul 11 17:06 librccl.so.1 -> librccl.so.1.0.50200
-rwxr-xr-x  1 root root 118463128 Jun 28 00:14 librccl.so.1.0.50200
lrwxrwxrwx  1 root root        22 Jul 11 17:06 librocalution_hip.so -> librocalution_hip.so.0
lrwxrwxrwx  1 root root        30 Jul 11 17:06 librocalution_hip.so.0 -> librocalution_hip.so.0.1.50200
-rwxr-xr-x  1 root root  12613440 Jun 28 00:53 librocalution_hip.so.0.1.50200
lrwxrwxrwx  1 root root        18 Jul 11 17:06 librocalution.so -> librocalution.so.0
lrwxrwxrwx  1 root root        26 Jul 11 17:06 librocalution.so.0 -> librocalution.so.0.1.50200
-rwxr-xr-x  1 root root   9776016 Jun 28 00:53 librocalution.so.0.1.50200
lrwxrwxrwx  1 root root        15 Jul 11 17:06 librocblas.so -> librocblas.so.0
lrwxrwxrwx  1 root root        23 Jul 11 17:06 librocblas.so.0 -> librocblas.so.0.1.50200
-rwxr-xr-x  1 root root 231476848 Jun 28 00:47 librocblas.so.0.1.50200
lrwxrwxrwx  1 root root        23 Jul 11 17:07 librocfft-device-0.so -> librocfft-device-0.so.0
lrwxrwxrwx  1 root root        31 Jul 11 17:05 librocfft-device-0.so.0 -> librocfft-device-0.so.0.1.50200
-rwxr-xr-x  1 root root 725093408 Jun 28 00:08 librocfft-device-0.so.0.1.50200
lrwxrwxrwx  1 root root        23 Jul 11 17:07 librocfft-device-1.so -> librocfft-device-1.so.0
lrwxrwxrwx  1 root root        31 Jul 11 17:05 librocfft-device-1.so.0 -> librocfft-device-1.so.0.1.50200
-rwxr-xr-x  1 root root 756631464 Jun 28 00:08 librocfft-device-1.so.0.1.50200
lrwxrwxrwx  1 root root        23 Jul 11 17:07 librocfft-device-2.so -> librocfft-device-2.so.0
lrwxrwxrwx  1 root root        31 Jul 11 17:05 librocfft-device-2.so.0 -> librocfft-device-2.so.0.1.50200
-rwxr-xr-x  1 root root 738903360 Jun 28 00:08 librocfft-device-2.so.0.1.50200
lrwxrwxrwx  1 root root        23 Jul 11 17:07 librocfft-device-3.so -> librocfft-device-3.so.0
lrwxrwxrwx  1 root root        31 Jul 11 17:06 librocfft-device-3.so.0 -> librocfft-device-3.so.0.1.50200
-rwxr-xr-x  1 root root 637009768 Jun 28 00:08 librocfft-device-3.so.0.1.50200
lrwxrwxrwx  1 root root        14 Jul 11 17:07 librocfft.so -> librocfft.so.0
lrwxrwxrwx  1 root root        22 Jul 11 17:06 librocfft.so.0 -> librocfft.so.0.1.50200
-rwxr-xr-x  1 root root   5040944 Jun 28 00:08 librocfft.so.0.1.50200
lrwxrwxrwx  1 root root        17 Jul 11 17:04 librocm-core.so -> librocm-core.so.1
lrwxrwxrwx  1 root root        25 Jul 11 17:04 librocm-core.so.1 -> librocm-core.so.1.0.50200
-rwxr-xr-x  1 root root      7672 Jun 27 23:38 librocm-core.so.1.0.50200
lrwxrwxrwx  1 root root        19 Jul 11 17:05 librocm-dbgapi.so -> librocm-dbgapi.so.0
lrwxrwxrwx  1 root root        24 Jul 11 17:05 librocm-dbgapi.so.0 -> librocm-dbgapi.so.0.65.1
-rwxr-xr-x  1 root root    989760 Jun 27 23:27 librocm-dbgapi.so.0.65.1
lrwxrwxrwx  1 root root        28 Jul 11 17:07 librocm-debug-agent.so.2 -> librocm-debug-agent.so.2.0.3
-rwxr-xr-x  1 root root    186696 Jun 27 23:35 librocm-debug-agent.so.2.0.3
lrwxrwxrwx  1 root root        18 Jul 11 17:05 librocm_smi64.so -> librocm_smi64.so.5
lrwxrwxrwx  1 root root        26 Jul 11 17:05 librocm_smi64.so.5 -> librocm_smi64.so.5.0.50200
-rwxr-xr-x  1 root root    613016 Jun 27 22:38 librocm_smi64.so.5.0.50200
lrwxrwxrwx  1 root root        21 Jul 11 17:07 librocprofiler64.so -> librocprofiler64.so.1
lrwxrwxrwx  1 root root        29 Jul 11 17:07 librocprofiler64.so.1 -> librocprofiler64.so.1.0.50200
-rwxr-xr-x  1 root root    343976 Jun 27 23:19 librocprofiler64.so.1.0.50200
lrwxrwxrwx  1 root root        15 Jul 11 17:07 librocrand.so -> librocrand.so.1
lrwxrwxrwx  1 root root        23 Jul 11 17:05 librocrand.so.1 -> librocrand.so.1.1.50200
-rwxr-xr-x  1 root root  17738608 Jun 28 00:07 librocrand.so.1.1.50200
lrwxrwxrwx  1 root root        17 Jul 11 17:07 librocsolver.so -> librocsolver.so.0
lrwxrwxrwx  1 root root        25 Jul 11 17:05 librocsolver.so.0 -> librocsolver.so.0.1.50200
-rwxr-xr-x  1 root root 863176736 Jun 28 00:53 librocsolver.so.0.1.50200
lrwxrwxrwx  1 root root        17 Jul 11 17:06 librocsparse.so -> librocsparse.so.0
lrwxrwxrwx  1 root root        25 Jul 11 17:06 librocsparse.so.0 -> librocsparse.so.0.1.50200
-rwxr-xr-x  1 root root 375513600 Jun 28 00:11 librocsparse.so.0.1.50200
lrwxrwxrwx  1 root root        19 Jul 11 17:07 libroctracer64.so -> libroctracer64.so.1
lrwxrwxrwx  1 root root        27 Jul 11 17:07 libroctracer64.so.1 -> libroctracer64.so.1.0.50200
-rwxr-xr-x  1 root root    318536 Jun 27 23:36 libroctracer64.so.1.0.50200
lrwxrwxrwx  1 root root        15 Jul 11 17:07 libroctx64.so -> libroctx64.so.1
lrwxrwxrwx  1 root root        23 Jul 11 17:07 libroctx64.so.1 -> libroctx64.so.1.0.50200
-rwxr-xr-x  1 root root     82216 Jun 27 23:36 libroctx64.so.1.0.50200
drwxr-xr-x  3 root root        21 Jul 11 17:06 rocblas
-rw-r--r--  1 root root       462 Jun 27 23:38 rocmmod
drwxr-xr-x  2 root root        94 Jul 11 17:07 rocprofiler
drwxr-xr-x  2 root root        34 Jul 11 17:07 roctracer

KadirAkbudak avatar Sep 28 '22 16:09 KadirAkbudak

Yes, thanks for the clarification. Kadir had also produced some strace output which showed the files it was attempting to open. Maybe this would also be useful for you?

G-Ragghianti avatar Sep 28 '22 16:09 G-Ragghianti

@KadirAkbudak, thanks for the information. Can you confirm whether the ROCBLAS_TENSILE_LIBPATH environment variable is set to some path?

echo $ROCBLAS_TENSILE_LIBPATH 

rkamd avatar Sep 28 '22 18:09 rkamd

ROCBLAS_TENSILE_LIBPATH is unset in our environment.

Should I set it to /opt/rocm-5.2.0/lib/rocblas/library/?

KadirAkbudak avatar Sep 29 '22 14:09 KadirAkbudak

@KadirAkbudak, setting ROCBLAS_TENSILE_LIBPATH might fix the issue, but it should have worked even without it.

  • Is LD_LIBRARY_PATH set? Also, please provide the output of readelf app -d | grep path
  • If possible, please attach the strace output.

rkamd avatar Sep 29 '22 22:09 rkamd

My initial test with export ROCBLAS_TENSILE_LIBPATH=/opt/rocm/lib/rocblas/library/ is not giving the error. This is good news. I will repeat the test several times to make sure the error does not appear anymore.
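
For repeating the test with and without the variable, a small Python wrapper can set ROCBLAS_TENSILE_LIBPATH for a single child process and leave the surrounding shell untouched. This is a sketch under stated assumptions: the helper name run_with_tensile_libpath is hypothetical, and the only rocBLAS-specific piece is the environment variable named in this thread.

```python
import os
import subprocess
from typing import Sequence


def run_with_tensile_libpath(cmd: Sequence[str], lib_dir: str) -> subprocess.CompletedProcess:
    """Run `cmd` with ROCBLAS_TENSILE_LIBPATH set only for that process."""
    env = dict(os.environ)                   # copy, so the parent environment is unchanged
    env["ROCBLAS_TENSILE_LIBPATH"] = lib_dir
    return subprocess.run(cmd, env=env, capture_output=True, text=True, check=False)
```

A call such as run_with_tensile_libpath(["./tester", "gemm"], "/opt/rocm/lib/rocblas/library/") returns the exit code and captured output, so repeated runs with and without the variable can be compared side by side.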

$ echo $LD_LIBRARY_PATH
/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-mpi-2019.8.254-mszg6y4geg4wip2awwug3f4v5ztshevf/compilers_and_libraries_2020.2.254/linux/mpi/intel64/libfabric/lib:/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-mpi-2019.8.254-mszg6y4geg4wip2awwug3f4v5ztshevf/compilers_and_libraries_2020.2.254/linux/mpi/intel64/lib/release:/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-mpi-2019.8.254-mszg6y4geg4wip2awwug3f4v5ztshevf/compilers_and_libraries_2020.2.254/linux/mpi/intel64/lib:/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-oneapi-mkl-2022.0.2-csli2oqmbu6esrsvxrnenbde52w7touz/mkl/2022.0.2/lib/intel64:/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-oneapi-mkl-2022.0.2-csli2oqmbu6esrsvxrnenbde52w7touz/lib:/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/gcc-7.3.0-z3dl67mmil3qe2z5loqk6l5denetg5x7/lib64:/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/gcc-7.3.0-z3dl67mmil3qe2z5loqk6l5denetg5x7/lib:/opt/rocm/lib:/opt/rocm/lib64
$ readelf test/tester -d | grep path
 0x000000000000001d (RUNPATH)            Library runpath: [/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/gcc-8.3.0-uv3lobm7o75mkoallcbarw6lllpw222e/lib:/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/gcc-8.3.0-uv3lobm7o75mkoallcbarw6lllpw222e/lib64:/home/kadir/t022/slate-dev/lib:/home/kadir/t022/slate-dev/testsweeper:/home/kadir/t022/slate-dev/blaspp/lib:/home/kadir/t022/slate-dev/lapackpp/lib:/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-oneapi-mpi-2021.1.1-aiygi2vu5ocm3mx7zpyvsbbai6kzoet2/mpi/2021.1.1/lib/release:/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-oneapi-mpi-2021.1.1-aiygi2vu5ocm3mx7zpyvsbbai6kzoet2/mpi/2021.1.1/lib]

Strace output for a single SLATE routine.

KadirAkbudak avatar Sep 30 '22 13:09 KadirAkbudak

Thanks for the information. Was the strace output captured with the ROCBLAS_TENSILE_LIBPATH environment variable set? If so, could you please re-run strace without setting the variable?

I was expecting to see log lines like those below, but in the attached strace output I do not see any entry related to TensileLibrary_gfx90a.dat.

access("/opt/rocm-5.2.2/lib/../../Tensile/library", R_OK) = -1 ENOENT (No such file or directory)
access("/opt/rocm-5.2.2/liblibrary", R_OK) = -1 ENOENT (No such file or directory)
access("/opt/rocm-5.2.2/lib/rocblas/library/gfx90a", R_OK) = -1 ENOENT (No such file or directory)
access("/opt/rocm-5.2.2/lib/rocblas/library/TensileLibrary_gfx90a.dat", R_OK) = 0
mmap(NULL, 8392704, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7fcdeddff000
mprotect(0x7fcdede00000, 8388608, PROT_READ|PROT_WRITE|PROT_EXEC) = 0

rkamd avatar Sep 30 '22 14:09 rkamd

@KadirAkbudak, yes, please drop the -e trace=open filter, or trace "access" instead, and reattach a strace log of a failing run, as @rkamd requested. We are assuming this run succeeded because you had set the ROCBLAS_TENSILE_LIBPATH environment variable? That log appears to show it finding code objects, e.g. open("/opt/rocm-5.2.0/lib/rocblas/library//TensileLibrary_gfx90a.co", O_RDONLY) = 20, but we want to see the paths of the access failures when it looks for the .dat file. We need to find out why it fails, as we don't want you to have to set ROCBLAS_TENSILE_LIBPATH as a solution. Thanks.
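
Since a full strace log is long, a short filter helps isolate the probes that matter here. The Python sketch below pulls out access/open/openat calls whose path mentions "Tensile", together with their return values. The helper name tensile_probes and the regular expression are assumptions about the plain single-process strace line format shown in this thread; lines carrying a pid prefix are not handled.

```python
import re
from typing import List, Tuple

# Matches lines such as: access("/some/path", R_OK) = -1 ENOENT (...)
_CALL = re.compile(r'^(access|open|openat)\([^"]*?"([^"]+)"[^)]*\)\s*=\s*(-?\d+)')


def tensile_probes(strace_text: str, keyword: str = "Tensile") -> List[Tuple[str, str, int]]:
    """Return (syscall, path, result) for calls whose path contains `keyword`."""
    probes = []
    for line in strace_text.splitlines():
        match = _CALL.match(line.strip())
        if match and keyword.lower() in match.group(2).lower():
            probes.append((match.group(1), match.group(2), int(match.group(3))))
    return probes
```

A result of -1 on every .dat candidate would show which directories were searched. An empty list suggests the trace filter excluded the relevant system calls.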

TorreZuk avatar Sep 30 '22 15:09 TorreZuk

The file that I sent was run withOUT setting ROCBLAS_TENSILE_LIBPATH.

The following output is also withOUT setting ROCBLAS_TENSILE_LIBPATH:

strace -e trace=access ./tester   --origin s --target t,d --ref n --nb 8 --type s,d,c,z --lookahead 1 --transA n,t,c --transB n,t,c --dim 100 --dim 100x50 --dim 50x100 --dim 25x50x75 gemm
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
access("/etc/sysconfig/strcasecmp-nonascii", F_OK) = -1 ENOENT (No such file or directory)
access("/etc/sysconfig/strcasecmp-nonascii", F_OK) = -1 ENOENT (No such file or directory)
access("/etc/sysconfig/strcasecmp-nonascii", F_OK) = -1 ENOENT (No such file or directory)
access("/etc/sysconfig/strcasecmp-nonascii", F_OK) = -1 ENOENT (No such file or directory)
access("/etc/sysconfig/strcasecmp-nonascii", F_OK) = -1 ENOENT (No such file or directory)
access("/etc/sysconfig/strcasecmp-nonascii", F_OK) = -1 ENOENT (No such file or directory)
access("/etc/sysconfig/strcasecmp-nonascii", F_OK) = -1 ENOENT (No such file or directory)
access("/etc/sysconfig/strcasecmp-nonascii", F_OK) = -1 ENOENT (No such file or directory)
access("/etc/sysconfig/strcasecmp-nonascii", F_OK) = -1 ENOENT (No such file or directory)
access("/etc/sysconfig/strcasecmp-nonascii", F_OK) = -1 ENOENT (No such file or directory)
access("/etc/sysconfig/strcasecmp-nonascii", F_OK) = -1 ENOENT (No such file or directory)
access("/etc/sysconfig/strcasecmp-nonascii", F_OK) = -1 ENOENT (No such file or directory)
access("/etc/sysconfig/strcasecmp-nonascii", F_OK) = -1 ENOENT (No such file or directory)
access("/etc/sysconfig/strcasecmp-nonascii", F_OK) = -1 ENOENT (No such file or directory)
access("/etc/sysconfig/strcasecmp-nonascii", F_OK) = -1 ENOENT (No such file or directory)
access("/etc/sysconfig/strcasecmp-nonascii", F_OK) = -1 ENOENT (No such file or directory)
access("/etc/sysconfig/strcasecmp-nonascii", F_OK) = -1 ENOENT (No such file or directory)
access("/etc/sysconfig/strcasecmp-nonascii", F_OK) = -1 ENOENT (No such file or directory)
access("/etc/sysconfig/strcasecmp-nonascii", F_OK) = -1 ENOENT (No such file or directory)
access("/etc/sysconfig/strcasecmp-nonascii", F_OK) = -1 ENOENT (No such file or directory)
SLATE version 2022.07.00, id cf8095c
input: ./tester --origin s --target t,d --ref n --nb 8 --type s,d,c,z --lookahead 1 --transA n,t,c --transB n,t,c --dim 100 --dim 100x50 --dim 50x100 --dim 25x50x75 gemm
2022-10-03 08:37:24, MPI size 1, OpenMP threads 8, GPU devices available 2
                                                                                                                                                                                             
type  origin  target  gemm   go   transA   transB       m       n       k      alpha       beta    nb    p    q  la      error   time (s)       gflop/s  ref time (s)   ref gflop/s  status  
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-mpi-2019.8.254-mszg6y4geg4wip2awwug3f4v5ztshevf/compilers_and_libraries_2020.2.254/linux/mpi/intel64/libfabric/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-mpi-2019.8.254-mszg6y4geg4wip2awwug3f4v5ztshevf/compilers_and_libraries_2020.2.254/linux/mpi/intel64/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-mpi-2019.8.254-mszg6y4geg4wip2awwug3f4v5ztshevf/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-oneapi-mkl-2022.0.2-csli2oqmbu6esrsvxrnenbde52w7touz/mkl/2022.0.2/bin/intel64/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/gcc-7.3.0-z3dl67mmil3qe2z5loqk6l5denetg5x7/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/python-3.9.10-sfmfcn7qbsq4yxzd7donakza3jcpoqmf/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/current/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/environment-modules-4.6.1-oj2biyrcm3kij2ddfnxbm52opqw6n7by/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/usr/local/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/usr/bin/ptxas", R_OK|X_OK)     = -1 ENOENT (No such file or directory)
access("/usr/sbin/ptxas", R_OK|X_OK)    = -1 ENOENT (No such file or directory)
access("/opt/rocm/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/proc/self/fd", R_OK)           = 0
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-mpi-2019.8.254-mszg6y4geg4wip2awwug3f4v5ztshevf/compilers_and_libraries_2020.2.254/linux/mpi/intel64/libfabric/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-mpi-2019.8.254-mszg6y4geg4wip2awwug3f4v5ztshevf/compilers_and_libraries_2020.2.254/linux/mpi/intel64/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-mpi-2019.8.254-mszg6y4geg4wip2awwug3f4v5ztshevf/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-oneapi-mkl-2022.0.2-csli2oqmbu6esrsvxrnenbde52w7touz/mkl/2022.0.2/bin/intel64/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/gcc-7.3.0-z3dl67mmil3qe2z5loqk6l5denetg5x7/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/python-3.9.10-sfmfcn7qbsq4yxzd7donakza3jcpoqmf/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/current/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/environment-modules-4.6.1-oj2biyrcm3kij2ddfnxbm52opqw6n7by/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/usr/local/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/usr/bin/ptxas", R_OK|X_OK)     = -1 ENOENT (No such file or directory)
access("/usr/sbin/ptxas", R_OK|X_OK)    = -1 ENOENT (No such file or directory)
access("/opt/rocm/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-mpi-2019.8.254-mszg6y4geg4wip2awwug3f4v5ztshevf/compilers_and_libraries_2020.2.254/linux/mpi/intel64/libfabric/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-mpi-2019.8.254-mszg6y4geg4wip2awwug3f4v5ztshevf/compilers_and_libraries_2020.2.254/linux/mpi/intel64/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-mpi-2019.8.254-mszg6y4geg4wip2awwug3f4v5ztshevf/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-oneapi-mkl-2022.0.2-csli2oqmbu6esrsvxrnenbde52w7touz/mkl/2022.0.2/bin/intel64/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/gcc-7.3.0-z3dl67mmil3qe2z5loqk6l5denetg5x7/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/python-3.9.10-sfmfcn7qbsq4yxzd7donakza3jcpoqmf/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/current/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/environment-modules-4.6.1-oj2biyrcm3kij2ddfnxbm52opqw6n7by/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/usr/local/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/usr/bin/ptxas", R_OK|X_OK)     = -1 ENOENT (No such file or directory)
access("/usr/sbin/ptxas", R_OK|X_OK)    = -1 ENOENT (No such file or directory)
access("/opt/rocm/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("amdgcn-amd-amdhsa-ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-mpi-2019.8.254-mszg6y4geg4wip2awwug3f4v5ztshevf/compilers_and_libraries_2020.2.254/linux/mpi/intel64/libfabric/bin/amdgcn-amd-amdhsa-ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-mpi-2019.8.254-mszg6y4geg4wip2awwug3f4v5ztshevf/compilers_and_libraries_2020.2.254/linux/mpi/intel64/bin/amdgcn-amd-amdhsa-ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-mpi-2019.8.254-mszg6y4geg4wip2awwug3f4v5ztshevf/bin/amdgcn-amd-amdhsa-ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-oneapi-mkl-2022.0.2-csli2oqmbu6esrsvxrnenbde52w7touz/mkl/2022.0.2/bin/intel64/amdgcn-amd-amdhsa-ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/gcc-7.3.0-z3dl67mmil3qe2z5loqk6l5denetg5x7/bin/amdgcn-amd-amdhsa-ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/python-3.9.10-sfmfcn7qbsq4yxzd7donakza3jcpoqmf/bin/amdgcn-amd-amdhsa-ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/current/bin/amdgcn-amd-amdhsa-ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/environment-modules-4.6.1-oj2biyrcm3kij2ddfnxbm52opqw6n7by/bin/amdgcn-amd-amdhsa-ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/usr/local/bin/amdgcn-amd-amdhsa-ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/usr/bin/amdgcn-amd-amdhsa-ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/usr/sbin/amdgcn-amd-amdhsa-ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/opt/rocm/bin/amdgcn-amd-amdhsa-ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("ld.lld", R_OK|X_OK)             = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-mpi-2019.8.254-mszg6y4geg4wip2awwug3f4v5ztshevf/compilers_and_libraries_2020.2.254/linux/mpi/intel64/libfabric/bin/ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-mpi-2019.8.254-mszg6y4geg4wip2awwug3f4v5ztshevf/compilers_and_libraries_2020.2.254/linux/mpi/intel64/bin/ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-mpi-2019.8.254-mszg6y4geg4wip2awwug3f4v5ztshevf/bin/ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-oneapi-mkl-2022.0.2-csli2oqmbu6esrsvxrnenbde52w7touz/mkl/2022.0.2/bin/intel64/ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/gcc-7.3.0-z3dl67mmil3qe2z5loqk6l5denetg5x7/bin/ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/python-3.9.10-sfmfcn7qbsq4yxzd7donakza3jcpoqmf/bin/ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/current/bin/ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/environment-modules-4.6.1-oj2biyrcm3kij2ddfnxbm52opqw6n7by/bin/ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/usr/local/bin/ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/usr/bin/ld.lld", R_OK|X_OK)    = -1 ENOENT (No such file or directory)
access("/usr/sbin/ld.lld", R_OK|X_OK)   = -1 ENOENT (No such file or directory)
access("/opt/rocm/bin/ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/tmp/comgr-196675/output/a.so", F_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-mpi-2019.8.254-mszg6y4geg4wip2awwug3f4v5ztshevf/compilers_and_libraries_2020.2.254/linux/mpi/intel64/libfabric/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-mpi-2019.8.254-mszg6y4geg4wip2awwug3f4v5ztshevf/compilers_and_libraries_2020.2.254/linux/mpi/intel64/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-mpi-2019.8.254-mszg6y4geg4wip2awwug3f4v5ztshevf/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-oneapi-mkl-2022.0.2-csli2oqmbu6esrsvxrnenbde52w7touz/mkl/2022.0.2/bin/intel64/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/gcc-7.3.0-z3dl67mmil3qe2z5loqk6l5denetg5x7/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/python-3.9.10-sfmfcn7qbsq4yxzd7donakza3jcpoqmf/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/current/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/environment-modules-4.6.1-oj2biyrcm3kij2ddfnxbm52opqw6n7by/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/usr/local/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/usr/bin/ptxas", R_OK|X_OK)     = -1 ENOENT (No such file or directory)
access("/usr/sbin/ptxas", R_OK|X_OK)    = -1 ENOENT (No such file or directory)
access("/opt/rocm/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-mpi-2019.8.254-mszg6y4geg4wip2awwug3f4v5ztshevf/compilers_and_libraries_2020.2.254/linux/mpi/intel64/libfabric/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-mpi-2019.8.254-mszg6y4geg4wip2awwug3f4v5ztshevf/compilers_and_libraries_2020.2.254/linux/mpi/intel64/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-mpi-2019.8.254-mszg6y4geg4wip2awwug3f4v5ztshevf/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-oneapi-mkl-2022.0.2-csli2oqmbu6esrsvxrnenbde52w7touz/mkl/2022.0.2/bin/intel64/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/gcc-7.3.0-z3dl67mmil3qe2z5loqk6l5denetg5x7/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/python-3.9.10-sfmfcn7qbsq4yxzd7donakza3jcpoqmf/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/current/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/environment-modules-4.6.1-oj2biyrcm3kij2ddfnxbm52opqw6n7by/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/usr/local/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/usr/bin/ptxas", R_OK|X_OK)     = -1 ENOENT (No such file or directory)
access("/usr/sbin/ptxas", R_OK|X_OK)    = -1 ENOENT (No such file or directory)
access("/opt/rocm/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-mpi-2019.8.254-mszg6y4geg4wip2awwug3f4v5ztshevf/compilers_and_libraries_2020.2.254/linux/mpi/intel64/libfabric/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-mpi-2019.8.254-mszg6y4geg4wip2awwug3f4v5ztshevf/compilers_and_libraries_2020.2.254/linux/mpi/intel64/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-mpi-2019.8.254-mszg6y4geg4wip2awwug3f4v5ztshevf/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-oneapi-mkl-2022.0.2-csli2oqmbu6esrsvxrnenbde52w7touz/mkl/2022.0.2/bin/intel64/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/gcc-7.3.0-z3dl67mmil3qe2z5loqk6l5denetg5x7/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/python-3.9.10-sfmfcn7qbsq4yxzd7donakza3jcpoqmf/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/current/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/environment-modules-4.6.1-oj2biyrcm3kij2ddfnxbm52opqw6n7by/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/usr/local/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/usr/bin/ptxas", R_OK|X_OK)     = -1 ENOENT (No such file or directory)
access("/usr/sbin/ptxas", R_OK|X_OK)    = -1 ENOENT (No such file or directory)
access("/opt/rocm/bin/ptxas", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("amdgcn-amd-amdhsa-ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-mpi-2019.8.254-mszg6y4geg4wip2awwug3f4v5ztshevf/compilers_and_libraries_2020.2.254/linux/mpi/intel64/libfabric/bin/amdgcn-amd-amdhsa-ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-mpi-2019.8.254-mszg6y4geg4wip2awwug3f4v5ztshevf/compilers_and_libraries_2020.2.254/linux/mpi/intel64/bin/amdgcn-amd-amdhsa-ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-mpi-2019.8.254-mszg6y4geg4wip2awwug3f4v5ztshevf/bin/amdgcn-amd-amdhsa-ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-oneapi-mkl-2022.0.2-csli2oqmbu6esrsvxrnenbde52w7touz/mkl/2022.0.2/bin/intel64/amdgcn-amd-amdhsa-ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/gcc-7.3.0-z3dl67mmil3qe2z5loqk6l5denetg5x7/bin/amdgcn-amd-amdhsa-ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/python-3.9.10-sfmfcn7qbsq4yxzd7donakza3jcpoqmf/bin/amdgcn-amd-amdhsa-ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/current/bin/amdgcn-amd-amdhsa-ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/environment-modules-4.6.1-oj2biyrcm3kij2ddfnxbm52opqw6n7by/bin/amdgcn-amd-amdhsa-ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/usr/local/bin/amdgcn-amd-amdhsa-ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/usr/bin/amdgcn-amd-amdhsa-ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/usr/sbin/amdgcn-amd-amdhsa-ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/opt/rocm/bin/amdgcn-amd-amdhsa-ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("ld.lld", R_OK|X_OK)             = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-mpi-2019.8.254-mszg6y4geg4wip2awwug3f4v5ztshevf/compilers_and_libraries_2020.2.254/linux/mpi/intel64/libfabric/bin/ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-mpi-2019.8.254-mszg6y4geg4wip2awwug3f4v5ztshevf/compilers_and_libraries_2020.2.254/linux/mpi/intel64/bin/ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-mpi-2019.8.254-mszg6y4geg4wip2awwug3f4v5ztshevf/bin/ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/intel-oneapi-mkl-2022.0.2-csli2oqmbu6esrsvxrnenbde52w7touz/mkl/2022.0.2/bin/intel64/ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/gcc-7.3.0-z3dl67mmil3qe2z5loqk6l5denetg5x7/bin/ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/python-3.9.10-sfmfcn7qbsq4yxzd7donakza3jcpoqmf/bin/ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/current/bin/ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/nfs/apps/spacks/2022-02-10/opt/spack/linux-centos7-x86_64/gcc-7.3.0/environment-modules-4.6.1-oj2biyrcm3kij2ddfnxbm52opqw6n7by/bin/ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/usr/local/bin/ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/usr/bin/ld.lld", R_OK|X_OK)    = -1 ENOENT (No such file or directory)
access("/usr/sbin/ld.lld", R_OK|X_OK)   = -1 ENOENT (No such file or directory)
access("/opt/rocm/bin/ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
access("/tmp/comgr-88a1ed/output/a.so", F_OK) = -1 ENOENT (No such file or directory)
   s  scalpk    task  auto  col  notrans  notrans     100     100     100   3.1+1.4i   2.7+1.7i     8    1    1   1   1.65e-07     0.0152         0.131            NA            NA  pass    
   s  scalpk    task  auto  col  notrans  notrans     100      50      50   3.1+1.4i   2.7+1.7i     8    1    1   1   1.21e-07    0.00629        0.0794            NA            NA  pass    
   s  scalpk    task  auto  col  notrans  notrans      50     100     100   3.1+1.4i   2.7+1.7i     8    1    1   1   1.64e-07     0.0185        0.0542            NA            NA  pass    
   s  scalpk    task  auto  col  notrans  notrans      25      50      75   3.1+1.4i   2.7+1.7i     8    1    1   1   1.47e-07    0.00428        0.0438            NA            NA  pass    
   s  scalpk    task  auto  col  notrans    trans     100     100     100   3.1+1.4i   2.7+1.7i     8    1    1   1   1.83e-07     0.0222        0.0899            NA            NA  pass    
   s  scalpk    task  auto  col  notrans    trans     100      50      50   3.1+1.4i   2.7+1.7i     8    1    1   1   1.30e-07    0.00762        0.0656            NA            NA  pass    
   s  scalpk    task  auto  col  notrans    trans      50     100     100   3.1+1.4i   2.7+1.7i     8    1    1   1   1.82e-07     0.0160        0.0624            NA            NA  pass    
   s  scalpk    task  auto  col  notrans    trans      25      50      75   3.1+1.4i   2.7+1.7i     8    1    1   1   1.56e-07    0.00479        0.0391            NA            NA  pass    
   s  scalpk    task  auto  col  notrans     conj     100     100     100   3.1+1.4i   2.7+1.7i     8    1    1   1   1.83e-07     0.0151         0.133            NA            NA  pass    
   s  scalpk    task  auto  col  notrans     conj     100      50      50   3.1+1.4i   2.7+1.7i     8    1    1   1   1.25e-07    0.00449         0.111            NA            NA  pass    
   s  scalpk    task  auto  col  notrans     conj      50     100     100   3.1+1.4i   2.7+1.7i     8    1    1   1   1.80e-07     0.0113        0.0883            NA            NA  pass    
   s  scalpk    task  auto  col  notrans     conj      25      50      75   3.1+1.4i   2.7+1.7i     8    1    1   1   1.38e-07    0.00409        0.0458            NA            NA  pass    
   s  scalpk    task  auto  col    trans  notrans     100     100     100   3.1+1.4i   2.7+1.7i     8    1    1   1   1.23e-07     0.0171         0.117            NA            NA  pass    
   s  scalpk    task  auto  col    trans  notrans     100      50      50   3.1+1.4i   2.7+1.7i     8    1    1   1   9.15e-08    0.00889        0.0562            NA            NA  pass    
   s  scalpk    task  auto  col    trans  notrans      50     100     100   3.1+1.4i   2.7+1.7i     8    1    1   1   1.25e-07     0.0118        0.0845            NA            NA  pass    
   s  scalpk    task  auto  col    trans  notrans      25      50      75   3.1+1.4i   2.7+1.7i     8    1    1   1   1.09e-07    0.00319        0.0587            NA            NA  pass    
   s  scalpk    task  auto  col    trans    trans     100     100     100   3.1+1.4i   2.7+1.7i     8    1    1   1   1.26e-07     0.0203        0.0983            NA            NA  pass    
   s  scalpk    task  auto  col    trans    trans     100      50      50   3.1+1.4i   2.7+1.7i     8    1    1   1   9.55e-08    0.00982        0.0509            NA            NA  pass    
   s  scalpk    task  auto  col    trans    trans      50     100     100   3.1+1.4i   2.7+1.7i     8    1    1   1   1.47e-07     0.0120        0.0834            NA            NA  pass    
   s  scalpk    task  auto  col    trans    trans      25      50      75   3.1+1.4i   2.7+1.7i     8    1    1   1   1.15e-07    0.00390        0.0481            NA            NA  pass    
   s  scalpk    task  auto  col    trans     conj     100     100     100   3.1+1.4i   2.7+1.7i     8    1    1   1   1.33e-07     0.0153         0.131            NA            NA  pass    
   s  scalpk    task  auto  col    trans     conj     100      50      50   3.1+1.4i   2.7+1.7i     8    1    1   1   1.02e-07    0.00982        0.0509            NA            NA  pass    
   s  scalpk    task  auto  col    trans     conj      50     100     100   3.1+1.4i   2.7+1.7i     8    1    1   1   1.31e-07     0.0116        0.0859            NA            NA  pass    
   s  scalpk    task  auto  col    trans     conj      25      50      75   3.1+1.4i   2.7+1.7i     8    1    1   1   9.30e-08    0.00431        0.0435            NA            NA  pass    
   s  scalpk    task  auto  col     conj  notrans     100     100     100   3.1+1.4i   2.7+1.7i     8    1    1   1   1.21e-07     0.0147         0.136            NA            NA  pass    
   s  scalpk    task  auto  col     conj  notrans     100      50      50   3.1+1.4i   2.7+1.7i     8    1    1   1   1.02e-07    0.00494         0.101            NA            NA  pass    
   s  scalpk    task  auto  col     conj  notrans      50     100     100   3.1+1.4i   2.7+1.7i     8    1    1   1   1.30e-07    0.00875         0.114            NA            NA  pass    
   s  scalpk    task  auto  col     conj  notrans      25      50      75   3.1+1.4i   2.7+1.7i     8    1    1   1   8.38e-08    0.00492        0.0381            NA            NA  pass    
   s  scalpk    task  auto  col     conj    trans     100     100     100   3.1+1.4i   2.7+1.7i     8    1    1   1   1.14e-07     0.0182         0.110            NA            NA  pass    
   s  scalpk    task  auto  col     conj    trans     100      50      50   3.1+1.4i   2.7+1.7i     8    1    1   1   9.35e-08    0.00914        0.0547            NA            NA  pass    
   s  scalpk    task  auto  col     conj    trans      50     100     100   3.1+1.4i   2.7+1.7i     8    1    1   1   1.62e-07    0.00328         0.305            NA            NA  pass    
   s  scalpk    task  auto  col     conj    trans      25      50      75   3.1+1.4i   2.7+1.7i     8    1    1   1   1.02e-07    0.00256        0.0733            NA            NA  pass    
   s  scalpk    task  auto  col     conj     conj     100     100     100   3.1+1.4i   2.7+1.7i     8    1    1   1   1.26e-07     0.0195         0.103            NA            NA  pass    
   s  scalpk    task  auto  col     conj     conj     100      50      50   3.1+1.4i   2.7+1.7i     8    1    1   1   9.02e-08     0.0115        0.0436            NA            NA  pass    
   s  scalpk    task  auto  col     conj     conj      50     100     100   3.1+1.4i   2.7+1.7i     8    1    1   1   1.36e-07     0.0106        0.0939            NA            NA  pass    
   s  scalpk    task  auto  col     conj     conj      25      50      75   3.1+1.4i   2.7+1.7i     8    1    1   1   9.10e-08    0.00398        0.0472            NA            NA  pass    

rocBLAS error: Cannot read /opt/rocm/rocblas/library/TensileLibrary.dat: Illegal seek
+++ killed by SIGABRT (core dumped) +++
FAILED : exit code -6

KadirAkbudak avatar Oct 03 '22 13:10 KadirAkbudak

Summary:

  • Unable to reproduce the error in any local environment.
  • Code review and the debug logs provided by @KadirAkbudak did not indicate any coding error.
  • Provided an environment variable for the user to bypass this issue.

Closing this ticket on the assumption that it is resolved or no longer relevant. Feel free to open a new issue if required.

rkamd avatar Mar 16 '23 15:03 rkamd

Please re-open this ticket.

I hit the exact same issue when trying to use tensorflow-rocm 2.11 on the tutorial code at https://www.tensorflow.org/text/tutorials/text_generation. I have an MI100 GPU, running Ubuntu 22.04.2 LTS with rocm 5.4.3. I installed all the packages using amdgpu-install with the usecases rocm and mllib, then additionally installed the packages miopen-hip-gfx908-120kdb and miopenkernels-gfx908-120kdb.

export ROCBLAS_TENSILE_LIBPATH=/opt/rocm/lib/rocblas/library/ fixes that specific problem, but it should work out of the box. (And I hit another issue right after...)
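
For anyone applying the same workaround, here is a minimal sketch that picks the directory to export by checking where the TensileLibrary*.dat files actually live. The two candidate paths are the ones mentioned in this thread; the helper name find_tensile_dir is hypothetical and not part of rocBLAS, and other installations may use different locations.

```shell
# Hypothetical helper (not part of rocBLAS): print the first directory
# among the arguments that contains a TensileLibrary*.dat file.
find_tensile_dir() {
    for dir in "$@"; do
        # The pattern covers both the single pre-5.2 file and the
        # per-architecture files such as TensileLibrary_gfx90a.dat.
        for f in "$dir"/TensileLibrary*.dat; do
            if [ -f "$f" ]; then
                printf '%s\n' "$dir"
                return 0
            fi
        done
    done
    return 1
}

# Both candidate paths appear in this thread; adjust them for your install.
if tensile_dir=$(find_tensile_dir /opt/rocm/lib/rocblas/library /opt/rocm/rocblas/library); then
    export ROCBLAS_TENSILE_LIBPATH="$tensile_dir"
    echo "ROCBLAS_TENSILE_LIBPATH=$ROCBLAS_TENSILE_LIBPATH"
else
    echo "No TensileLibrary*.dat found in the candidate directories" >&2
fi
```

This only selects a path for the workaround; it does not explain why rocBLAS looks in the wrong place without the variable.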

Epliz avatar Mar 25 '23 09:03 Epliz

@Epliz, thanks for reporting the issue. Could you please open a new ticket with logs captured without the environment variable set?

rkamd avatar Mar 27 '23 18:03 rkamd

What kind of logs would help you? Strace?

Epliz avatar Mar 27 '23 21:03 Epliz

Yes, that would help.
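
One possible way to capture such a log, sketched under assumptions: ./your_app is a placeholder for the failing program, rocblas_strace.log is an arbitrary output name, and tensile_lines is a hypothetical helper that trims the trace down to the lines mentioning the Tensile library files (full traces like the one earlier in this thread are dominated by unrelated path lookups).

```shell
# Hypothetical helper: keep only the trace lines that mention the
# Tensile library files or the rocBLAS library directory.
tensile_lines() {
    grep -E 'TensileLibrary|rocblas/library' "$1"
}

# Trace file-related syscalls (following child processes) with the
# workaround variable removed from the environment.
# ./your_app is a placeholder for the failing program.
if command -v strace >/dev/null 2>&1 && [ -x ./your_app ]; then
    env -u ROCBLAS_TENSILE_LIBPATH \
        strace -f -e trace=file -o rocblas_strace.log ./your_app
    tensile_lines rocblas_strace.log
fi
```

Attaching the full rocblas_strace.log alongside the filtered lines is probably still worthwhile, since the filter may hide relevant context.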

rkamd avatar Mar 27 '23 21:03 rkamd