kokkos-kernels
compile failure due to missing file KokkosLapack_tpl_spec.hpp
I'm getting the following error when building our application:
In file included from /scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/Trilinos/packages/kokkos-kernels/lapack/impl/KokkosLapack_gesv_spec.hpp:130,
from /scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/build/ascicgpu/trilinos_cde_v3-gnu-OpenMPI-cuda_11_2-OpenMP-static-Release/packages/kokkos-kernels/lapack/eti/generated_specializations_cpp/gesv/Lapack_gesv_eti_DOUBLE_LAYOUTLEFT_EXECSPACE_CUDA_MEMSPACE_CUDASPACE.cpp:20:
/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/Trilinos/packages/kokkos-kernels/lapack/tpls/KokkosLapack_gesv_tpl_spec_decl.hpp:278:10: fatal error: KokkosLapack_tpl_spec.hpp: No such file or directory
This is kokkos and kokkos-kernels as contained within Trilinos/develop as of this morning.
The entire compile line is pretty messy, but here it is:
/projects/cde/v3/cee/spack/opt/spack/linux-rhel7-x86_64/gcc-10.3.0/openmpi-4.1.2-ated23f2cr5tikdwru4hsr7pl25jk2bp/bin/mpicxx -DKOKKOS_DEPENDENCE -I/projects/gemma_user/magma-2.6.2/cuda-11.2.2/include -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/build/ascicgpu/trilinos_cde_v3-gnu-OpenMPI-cuda_11_2-OpenMP-static-Release -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/build/ascicgpu/trilinos_cde_v3-gnu-OpenMPI-cuda_11_2-OpenMP-static-Release/packages/kokkos-kernels/blas -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/Trilinos/packages/kokkos-kernels/blas -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/build/ascicgpu/trilinos_cde_v3-gnu-OpenMPI-cuda_11_2-OpenMP-static-Release/packages/kokkos-kernels/lapack -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/Trilinos/packages/kokkos-kernels/lapack -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/build/ascicgpu/trilinos_cde_v3-gnu-OpenMPI-cuda_11_2-OpenMP-static-Release/packages/kokkos-kernels/graph -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/Trilinos/packages/kokkos-kernels/graph -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/Trilinos/packages/kokkos-kernels/ode -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/build/ascicgpu/trilinos_cde_v3-gnu-OpenMPI-cuda_11_2-OpenMP-static-Release/packages/kokkos-kernels -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/Trilinos/packages/kokkos-kernels/common/src -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/Trilinos/packages/kokkos-kernels/common/impl 
-I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/Trilinos/packages/kokkos-kernels/common/unit_test -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/Trilinos/packages/kokkos-kernels/batched -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/build/ascicgpu/trilinos_cde_v3-gnu-OpenMPI-cuda_11_2-OpenMP-static-Release/packages/kokkos-kernels/batched/eti -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/Trilinos/packages/kokkos-kernels/batched/dense/src -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/Trilinos/packages/kokkos-kernels/batched/dense/impl -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/Trilinos/packages/kokkos-kernels/batched/dense/unit_test -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/Trilinos/packages/kokkos-kernels/batched/sparse/src -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/Trilinos/packages/kokkos-kernels/batched/sparse/impl -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/Trilinos/packages/kokkos-kernels/batched/sparse/unit_test -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/Trilinos/packages/kokkos-kernels/blas/src -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/Trilinos/packages/kokkos-kernels/blas/impl -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/Trilinos/packages/kokkos-kernels/blas/eti -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/build/ascicgpu/trilinos_cde_v3-gnu-OpenMPI-cuda_11_2-OpenMP-static-Release/packages/kokkos-kernels/blas/eti 
-I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/Trilinos/packages/kokkos-kernels/blas/tpls -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/Trilinos/packages/kokkos-kernels/lapack/src -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/Trilinos/packages/kokkos-kernels/lapack/impl -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/Trilinos/packages/kokkos-kernels/lapack/eti -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/build/ascicgpu/trilinos_cde_v3-gnu-OpenMPI-cuda_11_2-OpenMP-static-Release/packages/kokkos-kernels/lapack/eti -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/Trilinos/packages/kokkos-kernels/lapack/tpls -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/Trilinos/packages/kokkos-kernels/graph/src -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/Trilinos/packages/kokkos-kernels/graph/impl -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/Trilinos/packages/kokkos-kernels/graph/eti -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/build/ascicgpu/trilinos_cde_v3-gnu-OpenMPI-cuda_11_2-OpenMP-static-Release/packages/kokkos-kernels/graph/eti -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/Trilinos/packages/kokkos-kernels/sparse/src -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/Trilinos/packages/kokkos-kernels/sparse/impl -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/Trilinos/packages/kokkos-kernels/sparse/eti 
-I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/build/ascicgpu/trilinos_cde_v3-gnu-OpenMPI-cuda_11_2-OpenMP-static-Release/packages/kokkos-kernels/sparse/eti -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/Trilinos/packages/kokkos-kernels/sparse/tpls -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/Trilinos/packages/kokkos-kernels/ode/src -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/Trilinos/packages/kokkos-kernels/ode/impl -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/Trilinos/packages/kokkos-kernels/ode/unit_test -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/build/ascicgpu/trilinos_cde_v3-gnu-OpenMPI-cuda_11_2-OpenMP-static-Release/packages/kokkos/core/src -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/Trilinos/packages/kokkos/core/src -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/build/ascicgpu/trilinos_cde_v3-gnu-OpenMPI-cuda_11_2-OpenMP-static-Release/packages/kokkos -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/Trilinos/packages/kokkos/core/src/../../tpls/desul/include -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/build/ascicgpu/trilinos_cde_v3-gnu-OpenMPI-cuda_11_2-OpenMP-static-Release/packages/kokkos/containers/src -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/Trilinos/packages/kokkos/containers/src -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/build/ascicgpu/trilinos_cde_v3-gnu-OpenMPI-cuda_11_2-OpenMP-static-Release/packages/kokkos/algorithms/src -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/Trilinos/packages/kokkos/algorithms/src 
-I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/build/ascicgpu/trilinos_cde_v3-gnu-OpenMPI-cuda_11_2-OpenMP-static-Release/packages/kokkos/simd/src -I/scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/Trilinos/packages/kokkos/simd/src -pedantic -Wall -Wno-long-long -Wwrite-strings -extended-lambda -Wext-lambda-captures-this -arch=sm_70 -DADD_ -fopenmp -lgfortran -O3 -DNDEBUG -w -extended-lambda -Wext-lambda-captures-this -arch=sm_70 -std=c++17 -MD -MT packages/kokkos-kernels/CMakeFiles/kokkoskernels.dir/lapack/eti/generated_specializations_cpp/gesv/Lapack_gesv_eti_DOUBLE_LAYOUTLEFT_EXECSPACE_CUDA_MEMSPACE_CUDASPACE.cpp.o -MF CMakeFiles/kokkoskernels.dir/lapack/eti/generated_specializations_cpp/gesv/Lapack_gesv_eti_DOUBLE_LAYOUTLEFT_EXECSPACE_CUDA_MEMSPACE_CUDASPACE.cpp.o.d -o CMakeFiles/kokkoskernels.dir/lapack/eti/generated_specializations_cpp/gesv/Lapack_gesv_eti_DOUBLE_LAYOUTLEFT_EXECSPACE_CUDA_MEMSPACE_CUDASPACE.cpp.o -c /scratch/gemmaops/jenkins/workspace/trilinos-ascicgpu-cde-cuda-openmp-static-release/build/ascicgpu/trilinos_cde_v3-gnu-OpenMPI-cuda_11_2-OpenMP-static-Release/packages/kokkos-kernels/lapack/eti/generated_specializations_cpp/gesv/Lapack_gesv_eti_DOUBLE_LAYOUTLEFT_EXECSPACE_CUDA_MEMSPACE_CUDASPACE.cpp
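With a compile line this long, it can help to check mechanically whether any of the -I directories actually contains the header. The following is only a minimal sketch: `include_dirs` and `find_header` are hypothetical helper names, not part of Kokkos Kernels or Trilinos, and the sketch assumes the compile line can be tokenized with POSIX shell rules.

```python
import shlex
from pathlib import Path


def include_dirs(compile_line):
    """Return the directories passed via -I flags, in command-line order."""
    dirs = []
    tokens = shlex.split(compile_line)
    for i, tok in enumerate(tokens):
        if tok == "-I" and i + 1 < len(tokens):
            dirs.append(tokens[i + 1])   # form: -I <dir>
        elif tok.startswith("-I") and len(tok) > 2:
            dirs.append(tok[2:])         # form: -I<dir>
    return dirs


def find_header(compile_line, header):
    """Return every -I directory that actually contains `header`."""
    return [d for d in include_dirs(compile_line)
            if (Path(d) / header).is_file()]
```

An empty list from `find_header(line, "KokkosLapack_tpl_spec.hpp")` would mean the file is absent from every listed directory, as opposed to an -I flag being missing. Note that a quoted `#include` also searches the directory of the including file, which this sketch does not model.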
Pulling in @vqd8a who might know more, or be more helpful in answering any questions...
@glhenni can you share your CMake configuration line for Trilinos?
@ndellingwood In Kokkos Kernels 4.1, when gesv was in blas, KokkosBlas_tpl_spec.hpp was included in KokkosLapack_gesv_tpl_spec_decl.hpp. But now in Kokkos Kernels 4.2, since gesv was moved to the lapack directory, we should have a similar file, KokkosLapack_tpl_spec.hpp.
@lucbv
@vqd8a yeah, KokkosLapack_tpl_spec.hpp is not present. Looks like enabling CUSOLVER or MAGMA should reproduce it.
It's tricky because we use a settings file, or files, for a lot of our settings rather than specifying all of them via -DCMAKE_VAR=VALUE in the cmake invocation itself. If I had to guess, the one that's biting us is -DKokkosKernels_ENABLE_TPL_MAGMA=ON. We are using MAGMA and cmake is finding it fine.
If you REALLY need me to generate a standalone cmake command line for our build let me know. I'll translate our settings.cmake file into the equivalent cmake -D invocation.
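For what it's worth, that translation can be roughly automated. The sketch below assumes the settings file consists of simple single-line `set(NAME VALUE CACHE TYPE "doc")` entries; the real settings.cmake is not shown in this thread, so that layout is an assumption, and `settings_to_flags` is a hypothetical helper name.

```python
import re

# Matches single-line entries such as:
#   set(KokkosKernels_ENABLE_TPL_MAGMA ON CACHE BOOL "use MAGMA")
# The command name is case-insensitive in CMake; the CACHE keyword is not.
CACHE_SET = re.compile(
    r'^\s*(?i:set)\s*\(\s*(?P<name>[A-Za-z_][\w.\-+]*)\s+'
    r'(?P<value>"[^"]*"|\S+)\s+CACHE\s+(?P<type>[A-Z]+)\b'
)


def settings_to_flags(text):
    """Turn `set(NAME VALUE CACHE TYPE ...)` lines into -DNAME:TYPE=VALUE flags."""
    flags = []
    for line in text.splitlines():
        m = CACHE_SET.match(line)
        if m is None:
            continue  # comments, blank lines, and more complex commands are skipped
        value = m.group("value").strip('"')
        flags.append(f"-D{m.group('name')}:{m.group('type')}={value}")
    return flags
```

Values containing spaces would still need shell quoting when pasted into a command line, and multi-line `set()` calls or variables set inside `if()` blocks are outside what this sketch handles.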
No, that shouldn't be necessary, I think knowing Magma was enabled is a good clue. Thanks!
Looks like @lucbv has WIP for the cuSOLVER case in #2038. (Edit: fixed the PR number.)
@ndellingwood @lucbv There is a new KokkosLapack_cusolver.hpp in that PR. It looks to me that we need KokkosLapack_magma.hpp too.
Looks like the Magma-related stuff from blas/tpls/KokkosBlas_Cuda_tpl.hpp was copied to lapack/tpls/KokkosLapack_Cuda_tpl.
We're missing the Magma-related stuff (possibly other stuff) from blas/tpls/KokkosBlas_tpl_spec.hpp (no lapack counterpart).
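A header with no counterpart like this could in principle be caught without a MAGMA or cuSOLVER build by scanning the sources for includes that resolve nowhere. The following is a rough sketch of that idea, not an existing Kokkos Kernels tool: `dangling_includes` is a hypothetical helper, and the `prefix` filter is an assumption used to skip system and third-party headers.

```python
import re
from pathlib import Path

# Captures the file name in both #include "name" and #include <name>.
INCLUDE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]', re.MULTILINE)


def dangling_includes(source_dir, search_dirs, prefix="Kokkos"):
    """Map each .hpp under source_dir to the includes it names that resolve nowhere.

    Only include names starting with `prefix` are checked. Preprocessor
    guards are not evaluated, so includes inside disabled #if blocks count too.
    """
    roots = [Path(d) for d in search_dirs]
    missing = {}
    for path in sorted(Path(source_dir).rglob("*.hpp")):
        text = path.read_text(errors="replace")
        for name in INCLUDE.findall(text):
            if not name.startswith(prefix):
                continue
            # Look next to the including file first, then in each search dir.
            candidates = [path.parent / name] + [r / name for r in roots]
            if not any(c.is_file() for c in candidates):
                missing.setdefault(path.name, []).append(name)
    return missing
```

Because the guards are ignored on purpose, an include that only becomes active when a TPL is enabled would still be reported, which is the situation that seems to have hidden this problem from builds without MAGMA.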
@ndellingwood the work I have in the cuSOLVER PR will eventually fix the problem but I do not think that it would be appropriate to cherry pick for a patch release though. I see that your commits above are more strategic about getting only the small subset needed though. Let me know if you need help with it? Ultimately we need to get MAGMA on Weaver or Caraway to reproduce this...
> Let me know if you need help with it?
@lucbv thanks, the updates seemed straightforward but I'm still hitting compilation errors, may follow up with you for some help if I get stuck
> we need to get MAGMA on Weaver or Caraway to reproduce this
yeah, I built my own copy on Weaver for now, I'll put in a request for the TPL and share the config (they'll use spack, but config should help with the recipe)
> the work I have in the cuSOLVER PR will eventually fix the problem but I do not think that it would be appropriate to cherry pick for a patch release though.
@lucbv yeah, and since it follows other updates (e.g. rocsolver) we'd probably have to pull in extra stuff for a clean patch. For now for an eventual Trilinos patch, in addition to the magma fixes, should we remove the cusolver stub from the lapack updates (in particular, KokkosLapack_Cuda_tpl.hpp, which includes the problematic KokkosLapack_tpl_spec.hpp)?
Yeah if that can be removed without breaking anything that would be good actually. As long as MAGMA still works that should be good enough.
A fix for 4.2.00 (against release-candidate-4.2.01) was issued with #2044, along with Trilinos PR https://github.com/trilinos/Trilinos/pull/12555. @glhenni hopefully these resolve the issue in Trilinos (the PR is set for AUTOMERGE).