Tensile
build fails with rocm4.1
I just tried building using the published instructions on GitHub, but the build appears to fail. I installed rocm4.1 and checked out the 4.1 branch of Tensile. The develop branch also fails with the same error.
root@guest:~/indie/Tensile/build# git branch -r
origin/2.6-hip-flags
origin/HEAD -> origin/develop
origin/aditylad/test1
origin/amd-feature-targetid
origin/arct-wip
...
origin/master-rocm-3.5
origin/mfma
origin/mfma_opt
origin/miot
origin/miot-pack
origin/miot-staging
origin/msgpack-backend
root@guest:~/indie/Tensile/build# git branch -r | grep rocm
origin/develop-rocm20
origin/master-rocm-2.1
origin/master-rocm-2.10
...
origin/rocm-3.9.x
origin/rocm-4.1.x
origin/rocm-4.2.x
origin/rocm-nm-3493
root@guest:~/indie/Tensile/build# git checkout rocm-4.1.x
Branch 'rocm-4.1.x' set up to track remote branch 'rocm-4.1.x' from 'origin'.
Switched to a new branch 'rocm-4.1.x'
root@guest:~/indie/Tensile/build# ../Tensile/bin/Tensile ../Tensile/Configs/rocblas_sgemm_asm_only.yaml ./
Message pack python library not detected. Must use YAML backend instead.
################################################################################
Tensile v4.26.0
Config: /root/indie/Tensile/Tensile/Configs/rocblas_sgemm_asm_only.yaml
################################################################################
Restoring default globalParameters
Detected local GPU with ISA: gfx900
Traceback (most recent call last):
File "../Tensile/bin/Tensile", line 36, in filename
.
TypeError: expected str, bytes or os.PathLike object, not NoneType
root@guest:~/indie/Tensile/build# modprobe amdgpu
root@guest:~/indie/Tensile/build# ../Tensile/bin/Tensile ../Tensile/Configs/rocblas_sgemm_asm_only.yaml ./^C
root@guest:~/indie/Tensile/build# ls -l /opt
total 12
drwxr-xr-x 7 root root 4096 Apr 2 16:19 amdgpu
drwxr-xr-x 5 root root 4096 Dec 24 11:01 amdgpu-pro
lrwxrwxrwx 1 root root 22 Apr 2 16:23 rocm -> /etc/alternatives/rocm
drwxr-xr-x 18 root root 4096 Apr 2 16:19 rocm-4.1.0
root@guest:~/indie/Tensile/build# git branch
  develop
* rocm-4.1.x
I looked again more closely, and it appears I set the path incorrectly. I can build now, so this can be closed.
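For anyone hitting the same TypeError, here is a minimal sketch, assuming the failure comes from a tool lookup that found nothing on PATH. `shutil.which` returns `None` for a missing executable, and passing `None` to an `os.path` function raises the same message as in the log above. The helper names `locate_tools` and `describe_missing` are hypothetical and not part of Tensile; `hipcc` is only an example tool name.

```python
import os
import shutil


def locate_tools(names, search_path=None):
    """Map each tool name to its full path, or None if it is not found.

    search_path is a PATH-style string; None means use the current PATH.
    """
    return {name: shutil.which(name, path=search_path) for name in names}


def describe_missing(found):
    """Return the sorted names of tools that could not be located."""
    return sorted(name for name, path in found.items() if path is None)


if __name__ == "__main__":
    # Hypothetical check: is hipcc reachable with the current PATH?
    found = locate_tools(["hipcc"])
    missing = describe_missing(found)
    if missing:
        print("Not on PATH:", ", ".join(missing))
    else:
        print("All tools found:", found)

    # Passing a None result straight to os.path reproduces the log's message.
    try:
        os.path.dirname(None)
    except TypeError as exc:
        print("TypeError:", exc)
```

Checking for `None` before building paths from the lookup result turns the opaque TypeError into a clear "tool not found" report.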
I cannot build with rocm-4.2 after following the instructions at https://github.com/ROCmSoftwarePlatform/Tensile/wiki. See the build error log below:
Welcome to Ubuntu 18.04.5 LTS (GNU/Linux 5.4.38+ x86_64)
* Documentation: https://help.ubuntu.com
* Management: https://landscape.canonical.com
* Support: https://ubuntu.com/advantage
93 updates can be applied immediately.
68 of these updates are standard security updates.
To see these additional updates run: apt list --upgradable
Your Hardware Enablement Stack (HWE) is supported until April 2023.
Last login: Tue Aug 10 00:26:54 2021 from 192.168.122.1
root@sriov-guest:/git.co/dev-learn/ml-ubuntu/tensorflow/tflow-2nded# ls -l ~
total 78356
lrwxrwxrwx 1 root root 14 Jun 16 10:14 ROCm -> /root/ROCm-4.2
drwxr-xr-x 50 root root 4096 Jun 16 10:12 ROCm-4.2
drwxr-xr-x 2 root root 4096 Jun 16 04:41 bin
-rw-r--r-- 1 root root 80213604 Jul 31 05:07 google-chrome-stable_current_amd64.deb
-rwxr-xr-x 1 root root 854 Jun 16 04:49 rocm-source.sh
drwxr-xr-x 5 root root 4096 Aug 6 14:58 rocprofiler_py
drwxr-xr-x 6 root root 4096 Jun 16 04:38 snap
root@sriov-guest:/git.co/dev-learn/ml-ubuntu/tensorflow/tflow-2nded# cd
root@sriov-guest:~# git clone https://github.com/ROCmSoftwarePlatform/Tensile.git tensile
Cloning into 'tensile'...
remote: Enumerating objects: 35642, done.
remote: Counting objects: 100% (1266/1266), done.
remote: Compressing objects: 100% (630/630), done.
remote: Total 35642 (delta 872), reused 911 (delta 618), pack-reused 34376
Receiving objects: 100% (35642/35642), 82.53 MiB | 4.60 MiB/s, done.
Resolving deltas: 100% (27248/27248), done.
root@sriov-guest:~# cp -r tensile/ tensile-clean
root@sriov-guest:~# cd tensile
root@sriov-guest:~/tensile# ls
CHANGELOG.md HostLibraryTests MANIFEST.in Tensile bump-version.sh docs requirements.txt tox.ini tuning_docs
CONTRIBUTING.md LICENSE.md README.md bin docker pytest.ini setup.py tuning
root@sriov-guest:~/tensile# git branch -r rocm -i
fatal: -a and -r options to 'git branch' do not make sense with a branch name
root@sriov-guest:~/tensile# git branch -r | grep -i rocm -i
origin/develop-rocm20
origin/master-rocm-2.1
origin/master-rocm-2.10
origin/master-rocm-2.2
origin/master-rocm-2.3
origin/master-rocm-2.4
origin/master-rocm-2.5
origin/master-rocm-2.6
origin/master-rocm-2.7
origin/master-rocm-2.8
origin/master-rocm-2.9
origin/master-rocm-3.0
origin/master-rocm-3.0.1
origin/master-rocm-3.1
origin/master-rocm-3.2
origin/master-rocm-3.3
origin/master-rocm-3.5
origin/release/rocm-rel-4.3
origin/release/rocm-rel-4.4
origin/rocm-3.10.x
origin/rocm-3.6.x
origin/rocm-3.7.x
origin/rocm-3.8.x
origin/rocm-3.9.x
origin/rocm-4.1.x
origin/rocm-4.2.x
origin/rocm-nm-3493
root@sriov-guest:~/tensile# git checkout rocm-4.2.x
Branch 'rocm-4.2.x' set up to track remote branch 'rocm-4.2.x' from 'origin'.
Switched to a new branch 'rocm-4.2.x'
root@sriov-guest:~/tensile# mkdir build
root@sriov-guest:~/tensile# cd build
root@sriov-guest:~/tensile/build# hsitory^C
root@sriov-guest:~/tensile/build# ../Tensile/bin/Tensile ../Tensile/Configs/rocblas_sgemm_asm_only.yaml ./
Message pack python library not detected. Must use YAML backend instead.
################################################################################
#
# Tensile v4.27.0
# Config: /root/tensile/Tensile/Configs/rocblas_sgemm_asm_only.yaml
#
################################################################################
# Restoring default globalParameters
# Detected local GPU with ISA: gfx900
cap               gfx000  gfx803  gfx900  gfx906  gfx908  gfx1010  gfx1011
HasAddLshl        0       0       1       1       1       1        1
HasAtomicAdd      0       0       0       0       1       0        0
HasCodeObjectV3   0       0       0       0       0       0        0
HasDirectToLds    0       1       1       1       1       1        1
HasExplicitCO     0       0       1       1       1       1        1
HasExplicitNC     0       0       0       0       0       1        1
HasLshlOr         0       0       1       1       1       1        1
HasMFMA           0       0       0       0       1       0        0
HasSMulHi         0       0       1       1       1       1        1
MaxLgkmcnt        1       1       1       1       1       1        1
MaxVmcnt          0       1       1       1       1       1        1
SupportedISA      0       1       1       1       1       1        1
SupportedSource   1       1       1       1       1       1        1
v_dot2_f32_f16    0       0       0       1       1       0        1
v_dot2c_f32_f16   0       0       0       0       1       0        1
v_fma_f16         0       0       1       1       1       1        1
v_fma_mix_f32     0       0       0       1       1       1        1
v_mac_f16         0       1       1       1       1       0        0
v_mad_mix_f32     0       0       1       0       0       0        0
v_pk_fma_f16      0       0       1       1       1       1        1
CMPXWritesSGPR    1       1       1       1       1       0        0
HasEccHalf        0       0       0       1       1       0        0
HasWave32         0       0       0       0       0       1        1
SeparateVscnt     0       0       0       0       0       1        1
Waitcnt0Disabled  0       0       0       0       1       0        0
# Found hipcc version 4.2.21155-37cb3a34
# Command-line override: CodeObjectVersion
Overriding CodeObjectVersion=V3
Overriding CxxCompiler=hipcc
-- The C compiler identification is GNU 7.5.0
-- The CXX compiler identification is Clang 12.0.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /opt/rocm/bin/hipcc
-- Check for working CXX compiler: /opt/rocm/bin/hipcc -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- ROCclr at /opt/rocm/lib/cmake/rocclr
-- hip::amdhip64 is SHARED_LIBRARY
-- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS
-- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS - Success
LLVMObjectYAML_LIBRARY: /usr/lib/llvm-6.0/lib/libLLVMObjectYAML.a
-- Found Boost: /usr/include (found version "1.65.1") found components: program_options filesystem system
-- Found ROCmSMI: /opt/rocm/lib/librocm_smi64.so
-- Configuring done
CMake Warning at client/CMakeLists.txt:39 (add_executable):
Cannot generate a safe runtime search path for target tensile_client
because files in some directories may conflict with libraries in implicit
directories:
runtime library [libamdhip64.so.4] in /opt/rocm/hip/lib may be hidden by files in:
/opt/rocm/lib
Some of these libraries may not be found correctly.
-- Generating done
-- Build files have been written to: /root/tensile/build/0_Build
Scanning dependencies of target TensileHost
[ 2%] Building CXX object lib/CMakeFiles/TensileHost.dir/source/AMDGPU.cpp.o
[ 5%] Building CXX object lib/CMakeFiles/TensileHost.dir/source/ContractionSolution.cpp.o
[ 7%] Building CXX object lib/CMakeFiles/TensileHost.dir/source/Debug.cpp.o
[ 10%] Building CXX object lib/CMakeFiles/TensileHost.dir/source/ArithmeticUnitTypes.cpp.o
[ 13%] Building CXX object lib/CMakeFiles/TensileHost.dir/source/DataTypes.cpp.o
[ 18%] Building CXX object lib/CMakeFiles/TensileHost.dir/source/EmbeddedLibrary.cpp.o
[ 18%] Building CXX object lib/CMakeFiles/TensileHost.dir/source/ContractionProblem.cpp.o
[ 21%] Building CXX object lib/CMakeFiles/TensileHost.dir/source/KernelArguments.cpp.o
[ 23%] Building CXX object lib/CMakeFiles/TensileHost.dir/source/KernelLanguageTypes.cpp.o
[ 28%] Building CXX object lib/CMakeFiles/TensileHost.dir/source/TensorDescriptor.cpp.o
[ 28%] Building CXX object lib/CMakeFiles/TensileHost.dir/source/TensorOps.cpp.o
[ 31%] Building CXX object lib/CMakeFiles/TensileHost.dir/source/PerformanceMetricTypes.cpp.o
[ 34%] Building CXX object lib/CMakeFiles/TensileHost.dir/source/Tensile.cpp.o
[ 36%] Building CXX object lib/CMakeFiles/TensileHost.dir/source/llvm/YAML.cpp.o
[ 39%] Building CXX object lib/CMakeFiles/TensileHost.dir/source/Utils.cpp.o
[ 44%] Building CXX object lib/CMakeFiles/TensileHost.dir/source/hip/HipHardware.cpp.o
[ 44%] Building CXX object lib/CMakeFiles/TensileHost.dir/source/hip/HipSolutionAdapter.cpp.o
[ 47%] Building CXX object lib/CMakeFiles/TensileHost.dir/source/llvm/Loading.cpp.o
[ 50%] Linking CXX static library libTensileHost.a
[ 50%] Built target TensileHost
Scanning dependencies of target TensileClient
[ 52%] Building CXX object client/CMakeFiles/TensileClient.dir/source/BenchmarkTimer.cpp.o
[ 55%] Building CXX object client/CMakeFiles/TensileClient.dir/source/ClientProblemFactory.cpp.o
[ 57%] Building CXX object client/CMakeFiles/TensileClient.dir/source/CSVStackFile.cpp.o
[ 63%] Building CXX object client/CMakeFiles/TensileClient.dir/source/DataInitialization.cpp.o
[ 63%] Building CXX object client/CMakeFiles/TensileClient.dir/source/ConvolutionProblem.cpp.o
[ 65%] Building CXX object client/CMakeFiles/TensileClient.dir/source/HardwareMonitor.cpp.o
[ 68%] Building CXX object client/CMakeFiles/TensileClient.dir/source/HardwareMonitorListener.cpp.o
[ 71%] Building CXX object client/CMakeFiles/TensileClient.dir/source/MetaRunListener.cpp.o
[ 73%] Building CXX object client/CMakeFiles/TensileClient.dir/source/PerformanceReporter.cpp.o
[ 76%] Building CXX object client/CMakeFiles/TensileClient.dir/source/ProgressListener.cpp.o
[ 78%] Building CXX object client/CMakeFiles/TensileClient.dir/source/Reference.cpp.o
[ 81%] Building CXX object client/CMakeFiles/TensileClient.dir/source/ResultFileReporter.cpp.o
[ 84%] Building CXX object client/CMakeFiles/TensileClient.dir/source/SolutionIterator.cpp.o
[ 86%] Building CXX object client/CMakeFiles/TensileClient.dir/source/ReferenceValidator.cpp.o
[ 89%] Building CXX object client/CMakeFiles/TensileClient.dir/source/TimingEvents.cpp.o
[ 92%] Building CXX object client/CMakeFiles/TensileClient.dir/source/ResultReporter.cpp.o
[ 94%] Linking CXX static library libTensileClient.a
[ 94%] Built target TensileClient
Scanning dependencies of target tensile_client
[ 97%] Building CXX object client/CMakeFiles/tensile_client.dir/main.cpp.o
[100%] Linking CXX executable tensile_client
[100%] Built target tensile_client
################################################################################
# Converting Config to BenchmarkProcess Object
################################################################################
# Filling in Parameters With Defaults
# Convert Parameters to Steps
# Benchmark Common Parameters
# Fork Parameters
# Benchmark Fork Parameters
# Join Parameters
# Benchmark Join Parameters
# Benchmark Final
# NumBenchmarkSteps: 2
################################################################################
# Done Creating BenchmarkProcess Object
################################################################################
# Empty winners - use fast initialization of hardcodedParameters
################################################################################
# BenchmarkStep: Cijk_Ailk_Bljk_SB_00 - 00_BenchmarkFork 118.339s
# NumProblems: 1
# BenchmarkParameters:
# BenchmarkFork = { 0 }
Traceback (most recent call last):
File "../Tensile/bin/Tensile", line 36, in <module>
Tensile.main()
File "/root/tensile/Tensile/Tensile.py", line 283, in main
Tensile(sys.argv[1:])
File "/root/tensile/Tensile/Tensile.py", line 240, in Tensile
executeStepsInConfig(config)
File "/root/tensile/Tensile/Tensile.py", line 51, in executeStepsInConfig
BenchmarkProblems.main( config["BenchmarkProblems"] )
File "/root/tensile/Tensile/BenchmarkProblems.py", line 863, in main
problemSizeGroupConfig, problemSizeGroupIdx)
File "/root/tensile/Tensile/BenchmarkProblems.py", line 235, in benchmarkProblemType
benchmarkPermutations = constructForkPermutations(benchmarkStep.benchmarkParameters)
File "/root/tensile/Tensile/BenchmarkStructs.py", line 62, in constructForkPermutations
values = param[name]
TypeError: string indices must be integers
root@sriov-guest:~/tensile/build#
[END] 8/11/2021 6:40:19 PM
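The last frame of the traceback above evaluates `param[name]`. As a minimal sketch of how that expression can produce this message: Python raises it when `param` is a string rather than a dict. The function `collect_values`, the key `SomeParameter`, and the sample data are hypothetical and are not Tensile's code; whether the config is actually parsed this way on the rocm-4.2.x branch is not confirmed here.

```python
def collect_values(params):
    """Collect the value list of each single-key dict in params.

    Each entry is expected to look like {"Name": [v1, v2, ...]}.
    """
    collected = {}
    for param in params:
        if not isinstance(param, dict):
            raise ValueError(
                "expected a dict like {'Name': [values]}, got %r" % (param,)
            )
        for name in param:
            collected[name] = param[name]
    return collected


if __name__ == "__main__":
    # Well-formed input: a list of single-key dicts.
    print(collect_values([{"SomeParameter": [1, 2]}]))

    # Indexing a string with a string key reproduces the traceback's message.
    param = "SomeParameter"
    try:
        param["SomeParameter"]
    except TypeError as exc:
        print("TypeError:", exc)
```

If this is the cause, comparing the shape of the fork parameters in the failing YAML file against a config that builds successfully would be a reasonable next step.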
I don't think the Tensile project needs to be built standalone. It is used internally by rocBLAS to build the GEMM.
I found that the build succeeds with some of the YAML files but fails with others. So far I have tried only two, for example:
./Tensile/Tensile/bin/Tensile ../Tensile/Tensile/Configs/rocblas_sgemm_asm_only.yaml . (FAIL)
./Tensile/Tensile/bin/Tensile ../Tensile/Tensile/Configs/rocblas_dgemm_asm_lite (OK, built successfully)
./Tensile/Tensile/bin/Tensile ../Tensile/Tensile/Configs/rocblas_sgemm_asm_only.yaml . (FAIL):
File "../Tensile/Tensile/bin/Tensile", line 36, in <module>
Tensile.main()
File "/root/Tensile/Tensile/Tensile.py", line 283, in main
Tensile(sys.argv[1:])
File "/root/Tensile/Tensile/Tensile.py", line 223, in Tensile
assignGlobalParameters( config["GlobalParameters"] )
File "/root/Tensile/Tensile/Common.py", line 1682, in assignGlobalParameters
globalParameters["AsmCaps"][v] = GetAsmCaps(v)
File "/root/Tensile/Tensile/Common.py", line 1466, in GetAsmCaps
rv["HasDirectToLds"] = tryAssembler(isaVersion, "buffer_load_dword v40, v36, s[24:27], s28 offen offset:0 lds")
File "/root/Tensile/Tensile/Common.py", line 1527, in tryAssembler
result = subprocess.run(args, input=asmString.encode(), stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
File "/usr/lib/python3.6/subprocess.py", line 425, in run
stdout, stderr = process.communicate(input, timeout=timeout)
File "/usr/lib/python3.6/subprocess.py", line 863, in communicate
stdout, stderr = self._communicate(input, endtime, timeout)
File "/usr/lib/python3.6/subprocess.py", line 1534, in _communicate
ready = selector.select(timeout)
File "/usr/lib/python3.6/selectors.py", line 376, in select
fd_event_list = self._poll.poll(timeout)
KeyboardInterrupt
rocblas_dgemm_asm_lite.yaml:
================================================================================
================================ Memory Vendor =================================
GPU[0] : GPU memory vendor: unknown
================================================================================
============================= PCIe Replay Counter ==============================
GPU[0] : PCIe Replay Count: 0
================================================================================
================================ Serial Number =================================
GPU[0] : Serial Number: d599156e20f7fe7e
================================================================================
================================ KFD Processes =================================
No KFD PIDs currently running
================================================================================
============================= GPUs Indexed by PID ==============================
No KFD PIDs currently running
================================================================================
================== GPU Memory clock frequencies and voltages ===================
ERROR: 2 GPU[0]: od volt: RSMI_STATUS_NOT_SUPPORTED: This function is not supported in the current environment.
================================================================================
=============================== Current voltage ================================
GPU[0] : Voltage (mV): 756
================================================================================
================================== PCI Bus ID ==================================
GPU[0] : PCI Bus: 0000:00:07.0
================================================================================
============================= Firmware Information =============================
GPU[0] : ASD firmware version: 553648204
GPU[0] : CE firmware version: 0
GPU[0] : DMCU firmware version: 0
GPU[0] : MC firmware version: 0
GPU[0] : ME firmware version: 0
GPU[0] : MEC firmware version: 53
GPU[0] : MEC2 firmware version: 53
GPU[0] : PFP firmware version: 0
GPU[0] : RLC firmware version: 24
GPU[0] : RLC SRLC firmware version: 0
GPU[0] : RLC SRLG firmware version: 0
GPU[0] : RLC SRLS firmware version: 0
GPU[0] : SDMA firmware version: 14
GPU[0] : SDMA2 firmware version: 14
GPU[0] : SMC firmware version: 00.54.28.00
GPU[0] : SOS firmware version: 0x0017004f
GPU[0] : TA RAS firmware version: 27.00.01.37
GPU[0] : TA XGMI firmware version: 32.00.00.05
GPU[0] : UVD firmware version: 0x00000000
GPU[0] : VCE firmware version: 0x00000000
GPU[0] : VCN firmware version: 0x01101015
================================================================================
================================= Product Info =================================
GPU[0] : Card model: 0xc34
GPU[0] : Card vendor: Advanced Micro Devices, Inc. [AMD/ATI]
GPU[0] : Card SKU: D34304
================================================================================
================================== Pages Info ==================================
================================================================================
============================ Show Valid sclk Range =============================
ERROR: 2 GPU[0]: od volt: RSMI_STATUS_NOT_SUPPORTED: This function is not supported in the current environment.
GPU[0] : Unable to display sclk range
================================================================================
============================ Show Valid mclk Range =============================
ERROR: 2 GPU[0]: od volt: RSMI_STATUS_NOT_SUPPORTED: This function is not supported in the current environment.
GPU[0] : Unable to display mclk range
================================================================================
=========================== Show Valid voltage Range ===========================
ERROR: 2 GPU[0]: od volt: RSMI_STATUS_NOT_SUPPORTED: This function is not supported in the current environment.
GPU[0] : Unable to display voltage range
================================================================================
============================= Voltage Curve Points =============================
ERROR: 2 GPU[0]: od volt: RSMI_STATUS_NOT_SUPPORTED: This function is not supported in the current environment.
GPU[0] : Voltage Curve is not supported
================================================================================
WARNING: One or more commands failed
============================= End of ROCm SMI Log ==============================
Tensile::WARNING: ClientWriter Benchmark Process exited with code 2
Tensile::WARNING: BenchmarkProblems: Benchmark Process exited with code 2
# Get Results from CSV
Tensile::FATAL: Can't open "/root/build/1_BenchmarkProblems/Cijk_Ailk_Bljk_DB_00/Data/00_Final.csv" to get results
[END] 8/13/2021 10:23:53 PM
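Since one of the runs above was interrupted by hand and another exited with an error code, it can help to run each config under a timeout and record the outcome. This is a rough sketch using only the Python standard library. The name `run_with_timeout`, the 3600-second limit, and the per-config output directory names are assumptions for illustration; the command line simply mirrors the invocations quoted above rather than any documented Tensile interface.

```python
import os
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return 'ok', 'fail (exit N)', or 'timeout'."""
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    if result.returncode == 0:
        return "ok"
    return "fail (exit %d)" % result.returncode


if __name__ == "__main__" and len(sys.argv) > 2:
    # Usage (hypothetical): python run_configs.py <tensile_script> <config.yaml>...
    tensile, configs = sys.argv[1], sys.argv[2:]
    for index, config in enumerate(configs):
        out_dir = "out_%d" % index  # assumed: one output directory per config
        os.makedirs(out_dir, exist_ok=True)
        status = run_with_timeout([tensile, config, out_dir], timeout_s=3600)
        print(config, "->", status)
```

Separating "timeout" from "fail" makes it easier to tell a hang from a run that stops with an error.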
Closing old issue.