nvcc compiler doesn't work in ReFrame but works with Spack
I ran spack install nvhpc, which installed the NVHPC compilers. I then added their paths to my .spack/compilers.yaml file:
- compiler:
    spec: nvhpc@=23.9
    paths:
      cc: /lustre/home/br-kolgu/spack/opt/spack/cray-rhel8-broadwell/gcc-13.1.0/nvhpc-23.9-glmhdcpn2c4zouhzuatdrdj7x7igniik/Linux_x86_64/2023/compilers/bin/nvcc
      cxx: /lustre/home/br-kolgu/spack/opt/spack/cray-rhel8-broadwell/gcc-13.1.0/nvhpc-23.9-glmhdcpn2c4zouhzuatdrdj7x7igniik/Linux_x86_64/2023/compilers/bin/nvc++
      f77: /lustre/home/br-kolgu/spack/opt/spack/cray-rhel8-broadwell/gcc-13.1.0/nvhpc-23.9-glmhdcpn2c4zouhzuatdrdj7x7igniik/Linux_x86_64/2023/compilers/bin/nvfortran
      fc: /lustre/home/br-kolgu/spack/opt/spack/cray-rhel8-broadwell/gcc-13.1.0/nvhpc-23.9-glmhdcpn2c4zouhzuatdrdj7x7igniik/Linux_x86_64/2023/compilers/bin/nvfortran
    flags: {}
    operating_system: rhel8
    target: any
    modules: []
    environment: {}
    extra_rpaths: []
When I try to run
reframe -c benchmarks/apps/babelstream -r --tag thrust --system=isambard-macs:volta --setvar=num_cpus_per_task=40 -S build_locally=false -Sspack_spec='babelstream%nvhpc@23.9 +thrust implementation=cuda cuda_arch=70 backend=cuda'
(BabelStream version: https://github.com/spack/spack/pull/41019/), it gives me the following error message:
==> Warning: duplicate found for gcc@=12.1.0 on rhel8/any. Edit your compilers.yaml configuration to remove it.
==> Error: ProcessError: Command exited with status 77:
'/var/tmp/pbs.81951.gw4head/br-kolgu/spack-stage/spack-stage-gmake-4.4.1-qfhzizskwnrobnf4s7eqplfqaam3ppui/spack-src/configure' '--prefix=/lustre/home/br-kolgu/excalibur-tests/benchmarks/spack/isambard-macs/volta/opt/cray-rhel8-cascadelake/nvhpc-23.9/gmake-4.4.1-qfhzizskwnrobnf4s7eqplfqaam3ppui' '--without-guile' '--disable-nls' '--disable-dependency-tracking'
2 errors found in build log:
6 checking for gawk... gawk
7 checking whether make sets $(MAKE)... yes
8 checking whether make supports nested variables... yes
9 checking whether make supports the include directive... yes (GNU sty
le)
10 checking for gcc... /lustre/home/br-kolgu/spack/lib/spack/env/nvhpc/
nvc
11 checking whether the C compiler works... no
>> 12 configure: error: in `/var/tmp/pbs.81951.gw4head/br-kolgu/spack-stag
e/spack-stage-gmake-4.4.1-qfhzizskwnrobnf4s7eqplfqaam3ppui/spack-src
/spack-build':
>> 13 configure: error: C compiler cannot create executables
14 See `config.log' for more details
See build log for details:
/var/tmp/pbs.81951.gw4head/br-kolgu/spack-stage/spack-stage-gmake-4.4.1-qfhzizskwnrobnf4s7eqplfqaam3ppui/spack-build-out.txt
==> Warning: Skipping build of babelstream-5.0-fkrqvhfz5jf3di3n26hwl5djcxaky4nm since gmake-4.4.1-qfhzizskwnrobnf4s7eqplfqaam3ppui failed
==> Error: babelstream-5.0-fkrqvhfz5jf3di3n26hwl5djcxaky4nm: Package was not installed
==> Error: Installation request failed. Refer to reported errors for failing package(s).
But this compiler works when I use the spack install ... command directly, so I believe I am missing a step that configures ReFrame to pick up the compiler properly.
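The step that fails in the log above is configure's generic "whether the C compiler works" probe, which compiles a trivial program and checks that an executable comes out. A minimal sketch of a similar probe follows; it can be pointed directly at whatever path is listed under cc in compilers.yaml. The function name can_compile is hypothetical, and the real configure test (run through Spack's compiler wrapper with extra flags) does more than this:

```shell
#!/bin/bash
# Try to build a trivial C program with the given compiler.
# Returns 0 only if the compiler exits cleanly and an executable is produced.
can_compile() {
    local compiler="$1"
    local workdir
    workdir="$(mktemp -d)" || return 2
    printf 'int main(void) { return 0; }\n' > "$workdir/conftest.c"
    "$compiler" "$workdir/conftest.c" -o "$workdir/conftest" >/dev/null 2>&1
    local status=$?
    # Treat a missing output file as a failure even if the exit code was 0.
    [ -x "$workdir/conftest" ] || status=1
    rm -rf "$workdir"
    return "$status"
}
```

For example, can_compile /path/to/compilers/bin/nvc && echo "works" reports whether that binary can build plain C on its own. Because Spack calls the compiler through its wrapper, a direct probe like this is only a first check and not an exact reproduction.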
@kaanolgu If I am not mistaken, the cc entry in the compiler settings should point to nvc, not to nvcc. nvcc is part of the CUDA toolkit.
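The point above can be checked mechanically: the NVHPC C, C++ and Fortran compilers are named nvc, nvc++ and nvfortran, while nvcc is the CUDA compiler driver. A small sketch, assuming the paths section of the nvhpc entry is piped in as "role: path" lines; the function name check_nvhpc_paths is hypothetical:

```shell
#!/bin/bash
# Read "role: path" lines on stdin and report every role whose executable
# name differs from the one NVHPC uses for that role.
check_nvhpc_paths() {
    local role path name expected
    while read -r role path || [ -n "$role" ]; do
        role="${role%:}"
        case "$role" in
            cc)      expected="nvc" ;;
            cxx)     expected="nvc++" ;;
            f77|fc)  expected="nvfortran" ;;
            *)       continue ;;   # ignore anything that is not a compiler role
        esac
        name="${path##*/}"         # strip the directory part
        if [ "$name" != "$expected" ]; then
            echo "$role: expected $expected, found $name"
        fi
    done
}
```

Fed the four paths from the compilers.yaml entry in this issue, it would flag only the cc line, since that one ends in nvcc.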
@teojgo The executable inside that folder is nvcc.
@kaanolgu Could you share the build script generated by ReFrame? That's the rfm_build.sh script inside the stage folder.
@vkarak Sorry for the delayed reply. The rfm_build.sh file is this:
#!/bin/bash -l
#PBS -N rfm_THRUSTBench
#PBS -o rfm_build.out
#PBS -e rfm_build.err
#PBS -l select=1:mpiprocs=1:ncpus=16:ngpus=1
#PBS -q voltaq
cd /lustre/home/br-kolgu/excalibur-tests-upstream/stage/isambard-macs/volta/default/THRUSTBenchmark_NVIDIA
_onerror()
{
    exitcode=$?
    echo "-reframe: command \`$BASH_COMMAND' failed (exit code: $exitcode)"
    exit $exitcode
}
trap _onerror ERR
export OMP_NUM_THREADS=40
cp /lustre/home/br-kolgu/excalibur-tests-upstream/benchmarks/spack/common.yaml /lustre/home/br-kolgu/excalibur-tests-upstream/stage/isambard-macs/volta/default/THRUSTBenchmark_NVIDIA/common.yaml
cp -r /lustre/home/br-kolgu/excalibur-tests-upstream/benchmarks/spack/repo /lustre/home/br-kolgu/excalibur-tests-upstream/stage/isambard-macs/volta/default/THRUSTBenchmark_NVIDIA/repo
mkdir -p /lustre/home/br-kolgu/excalibur-tests-upstream/stage/isambard-macs/volta/default/THRUSTBenchmark_NVIDIA/spack_env
(cd /lustre/home/br-kolgu/excalibur-tests-upstream/benchmarks/spack/isambard-macs; find . \( -name "spack.yaml" -o -name "compilers.yaml" -o -name "packages.yaml" \) -print0 | xargs -0 tar cf - | tar -C /lustre/home/br-kolgu/excalibur-tests-upstream/stage/isambard-macs/volta/default/THRUSTBenchmark_NVIDIA/spack_env -xvf -)
spack -e /lustre/home/br-kolgu/excalibur-tests-upstream/stage/isambard-macs/volta/default/THRUSTBenchmark_NVIDIA/spack_env/volta config add "config:install_tree:root:/lustre/home/br-kolgu/excalibur-tests-upstream/benchmarks/spack/isambard-macs/volta/opt"
spack -e /lustre/home/br-kolgu/excalibur-tests-upstream/stage/isambard-macs/volta/default/THRUSTBenchmark_NVIDIA/spack_env/volta add babelstream%nvhpc@23.9 +thrust thrust_backend=cuda cuda_arch=70 backend=cuda flags=-allow-unsupported-compiler
spack -e /lustre/home/br-kolgu/excalibur-tests-upstream/stage/isambard-macs/volta/default/THRUSTBenchmark_NVIDIA/spack_env/volta install
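Apart from copying configuration files, the generated script does little more than call Spack on an environment. To separate ReFrame from Spack when debugging, the Spack invocations can be pulled out of the script and replayed by hand. A minimal sketch; the helper name spack_steps is hypothetical:

```shell
#!/bin/bash
# Print only the Spack invocations from a ReFrame-generated build script,
# so they can be re-run one at a time in an interactive shell.
spack_steps() {
    grep -E '^[[:space:]]*spack[[:space:]]' "$1"
}
```

Running spack_steps rfm_build.sh inside the stage folder would list the config add, add and install commands shown above. If replaying them outside ReFrame gives the same error, the problem most likely lies in the Spack configuration or package.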
And the new error message, from rfm_build.err, is this:
==> Error: ProcessError: Command exited with status 1:
'/lustre/home/br-kolgu/excalibur-tests-upstream/benchmarks/spack/isambard-macs/volta/opt/cray-rhel8-cascadelake/gcc-13.1.0/cmake-3.27.7-vscc6vyb4iqwb3lzzwt64rsla7cv3gog/bin/cmake' '-G' 'Unix Makefiles' '-DCMAKE_INSTALL_PREFIX:STRING=/lustre/home/br-kolgu/excalibur-tests-upstream/benchmarks/spack/isambard-macs/volta/opt/cray-rhel8-cascadelake/nvhpc-23.9/babelstream-5.0-enzenbzkm6jy4hiy3oixso3ybwjv3jni' '-DCMAKE_BUILD_TYPE:STRING=Release' '-DCMAKE_INTERPROCEDURAL_OPTIMIZATION:BOOL=OFF' '-DCMAKE_VERBOSE_MAKEFILE:BOOL=ON' '-DCMAKE_INSTALL_RPATH_USE_LINK_PATH:BOOL=ON' '-DCMAKE_INSTALL_RPATH:STRING=/lustre/home/br-kolgu/excalibur-tests-upstream/benchmarks/spack/isambard-macs/volta/opt/cray-rhel8-cascadelake/nvhpc-23.9/babelstream-5.0-enzenbzkm6jy4hiy3oixso3ybwjv3jni/lib;/lustre/home/br-kolgu/excalibur-tests-upstream/benchmarks/spack/isambard-macs/volta/opt/cray-rhel8-cascadelake/nvhpc-23.9/babelstream-5.0-enzenbzkm6jy4hiy3oixso3ybwjv3jni/lib64;/cm/shared/apps/cuda11.2/toolkit/11.2.0/lib64' '-DCMAKE_PREFIX_PATH:STRING=/lustre/home/br-kolgu/excalibur-tests-upstream/benchmarks/spack/isambard-macs/volta/opt/cray-rhel8-cascadelake/nvhpc-23.9/thrust-1.16.0-4vzbtqauvqmgrogstre4xb4noiiwi5sg;/lustre/home/br-kolgu/excalibur-tests-upstream/benchmarks/spack/isambard-macs/volta/opt/cray-rhel8-cascadelake/gcc-13.1.0/cmake-3.27.7-vscc6vyb4iqwb3lzzwt64rsla7cv3gog;/cm/shared/apps/cuda11.2/toolkit/11.2.0;/cm/shared/apps/cuda11.2/toolkit/11.2.0/targets/x86_64-linux/lib/cmake' '-DMODEL=thrust' '-DTHRUST_IMPL=CUDA' '-SDK_DIR=/lustre/home/br-kolgu/excalibur-tests-upstream/benchmarks/spack/isambard-macs/volta/opt/cray-rhel8-cascadelake/nvhpc-23.9/thrust-1.16.0-4vzbtqauvqmgrogstre4xb4noiiwi5sg/include' '-DCUDA_ARCH=70' '-DCMAKE_CUDA_COMPILER=/cm/shared/apps/cuda11.2/toolkit/11.2.0/bin/nvcc' '-DCMAKE_CUDA_FLAGS=-ccbin /lustre/home/br-kolgu/spack/lib/spack/env/nvhpc/nvc' '-DBACKEND=CUDA' '-DCUDA_EXTRA_FLAGS=-allow-unsupported-compiler' 
'/var/tmp/pbs.83122.gw4head/br-kolgu/spack-stage/spack-stage-babelstream-5.0-enzenbzkm6jy4hiy3oixso3ybwjv3jni/spack-src'
1 error found in build log:
55 BACKEND = `CUDA`
56 MANAGED = `OFF`
57 CMAKE_CUDA_COMPILER = `/cm/shared/apps/cuda11.2/toolkit/11.2.0/bi
n/nvcc`
58 CUDA_ARCH = `70`
59 CUDA_EXTRA_FLAGS = `-allow-unsupported-compiler`
60
>> 61 CMake Error at /lustre/home/br-kolgu/excalibur-tests-upstream/benchm
arks/spack/isambard-macs/volta/opt/cray-rhel8-cascadelake/gcc-13.1.0
/cmake-3.27.7-vscc6vyb4iqwb3lzzwt64rsla7cv3gog/share/cmake-3.27/Modu
les/CMakeDetermineCompilerId.cmake:753 (message):
62 Compiling the CUDA compiler identification source file
63 "CMakeCUDACompilerId.cu" failed.
64
65 Compiler: /cm/shared/apps/cuda11.2/toolkit/11.2.0/bin/nvcc
66
67 Build flags:
See build log for details:
/var/tmp/pbs.83122.gw4head/br-kolgu/spack-stage/spack-stage-babelstream-5.0-enzenbzkm6jy4hiy3oixso3ybwjv3jni/spack-build-out.txt
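The CMake failure above happens while compiling a tiny CUDA source with nvcc, using the host compiler passed through -ccbin. That step can be tried by hand to see nvcc's own diagnostics, which the truncated log does not show. A minimal sketch using only the flags that appear in the CMake command line above; the function name probe_nvcc is hypothetical, and a toolkit that does not recognise -allow-unsupported-compiler would fail for that reason alone:

```shell
#!/bin/bash
# Compile an empty CUDA kernel with the given nvcc and host compiler.
# Compiler diagnostics go to the terminal; returns nvcc's exit status.
probe_nvcc() {
    local nvcc="$1" host="$2"
    local workdir
    workdir="$(mktemp -d)" || return 2
    printf '__global__ void k(void) {}\nint main(void) { return 0; }\n' \
        > "$workdir/probe.cu"
    "$nvcc" -ccbin "$host" -allow-unsupported-compiler \
        -c "$workdir/probe.cu" -o "$workdir/probe.o"
    local status=$?
    rm -rf "$workdir"
    return "$status"
}
```

Called with the nvcc path from the log and, in turn, the nvc and nvc++ binaries as the host compiler, it may help show whether the failure depends on the host compiler that ends up in -ccbin.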
The run command I use is this:
reframe -c benchmarks/apps/babelstream -r --tag thrust --system=isambard-macs:volta --setvar=num_cpus_per_task=40 -S build_locally=false -Sspack_spec='babelstream%nvhpc@23.9 +thrust thrust_backend=cuda cuda_arch=70 backend=cuda flags=-allow-unsupported-compiler'
Since we are working on a newer version of the Spack package for BabelStream, the version used here is the one from https://github.com/spack/spack/pull/41019
It hasn't been merged yet. If it helps, the other models build with gcc, but anything that uses the oneapi or nvhpc compiler does not compile.
I can provide more information if needed.
@giordano Do you maybe have a hint about this? So far, I'm not sure if ReFrame is at fault here.
@vkarak @giordano Actually, I was able to reproduce the issue with a Spack environment too:
# This is a Spack Environment file.
#
# It describes a set of packages to be installed, along with
# configuration settings.
spack:
  # add package specs to the `specs` list
  specs:
  - cuda
  # - babelstream%[email protected]+cuda cuda_arch=70 # works
  - babelstream%[email protected]+cuda cuda_arch=70 mem=managed #works
  view: true
  include:
  - ./compilers.yaml
  packages:
    gmake:
      externals:
      - spec: [email protected]
        prefix: /usr
  concretizer:
    unify: true
When I use this spack.yaml file, it builds without any issues, but when I comment out the - cuda line, it gives me the same error message:
1 error found in build log:
54 specific short-term circumstances. Projects should be ported to the NEW
55 behavior and not rely on setting a policy to OLD.
56 Call Stack (most recent call first):
57 CMakeLists.txt:196 (setup)
58
59
>> 60 CMake Error at /lustre/home/br-kolgu/spack/opt/spack/cray-rhel8-broadwell/gcc-13.1.0/cmake-3.27.7-utysvikmqbmtirlmusucjwj4w536xjt2/share/cmake-3.27/Modules/CMakeDete
rmineCompilerId.cmake:753 (message):
61 Compiling the CUDA compiler identification source file
62 "CMakeCUDACompilerId.cu" failed.
63
64 Compiler:
65 /lustre/home/br-kolgu/spack/opt/spack/cray-rhel8-broadwell/nvhpc-23.9/cuda-10.0.130-kytmfgpgrgojj5fu3m26ozwp2gpo7avh/bin/nvcc
66
I can also share the concretizer output if needed.
I will close this as it's not clear that it is a ReFrame issue. Feel free to reopen it if you have more evidence that ReFrame is at fault here.