OpenFOAM2022+AmgX+PETSc-after determining tmp storage requirements for exclusive_scan: cudaErrorInvalidDevice: invalid device ordinal

Open snsmssss opened this issue 3 years ago • 2 comments

After some initial hiccups building the components, I am now having trouble running icoFoam (OpenFOAM): execution aborts and dumps core with the error message below. I am running this on a laptop with a GeForce RTX 3060 (6 GB of GPU memory). I would appreciate your help.

Number of GPU devices :: 1
AMGX version 2.2.0.132-opensource
Built on Aug 4 2022, 23:33:25
Compiled with CUDA Runtime 11.7, using CUDA driver 11.7
Cannot read file as JSON object, trying as AMGX config
Converting config string to current config version
Parsing configuration string: exception_handling=1 ;
Initializing AmgX-p
Initializing AmgX Linear Solver p
terminate called after throwing an instance of 'thrust::system::system_error'
what(): after determining tmp storage requirements for exclusive_scan: cudaErrorInvalidDevice: invalid device ordinal
Caught signal 6 - SIGABRT (abort)
/usr/lib/openfoam/amgx/build/libamgxsh.so : ()+0xf8864c
/lib/x86_64-linux-gnu/libpthread.so.0 : ()+0x14420
/lib/x86_64-linux-gnu/libc.so.6 : gsignal()+0xcb
/lib/x86_64-linux-gnu/libc.so.6 : abort()+0x12b
/lib/x86_64-linux-gnu/libstdc++.so.6 : ()+0x9e911
/lib/x86_64-linux-gnu/libstdc++.so.6 : ()+0xaa38c
/lib/x86_64-linux-gnu/libstdc++.so.6 : ()+0xaa3f7
/lib/x86_64-linux-gnu/libstdc++.so.6 : ()+0xaa6a9
/home/sambath/OpenFOAM/sambath-v2112/platforms/linux64GccDPInt32Opt/lib/libfoam2csr.so : int* thrust::cuda_cub::detail::exclusive_scan_n_impl<thrust::cuda_cub::par_t, int*, long, int*, int, thrust::plus >(thrust::cuda_cub::execution_policy<thrust::cuda_cub::par_t>&, int*, long, int*, int, thrust::plus)+0x32a
/home/sambath/OpenFOAM/sambath-v2112/platforms/linux64GccDPInt32Opt/lib/libfoam2csr.so : AmgXCSRMatrix::setValuesLDU(int, int, int, int, int, int const*, int const*, int, int const*, int const*, double const*, double const*, double const*, double const*)+0x9c1
/home/sambath/OpenFOAM/sambath-v2112/platforms/linux64GccDPInt32Opt/lib/libpetscFoam.so : Foam::amgxSolver::offloadMatrixArrays(AmgXCSRMatrix&, int&, int&, int&) const+0x761
/home/sambath/OpenFOAM/sambath-v2112/platforms/linux64GccDPInt32Opt/lib/libpetscFoam.so : Foam::amgxSolver::scalarSolve(Foam::Field&, Foam::Field const&, unsigned char) const+0x687
/home/sambath/OpenFOAM/sambath-v2112/platforms/linux64GccDPInt32Opt/lib/libpetscFoam.so : Foam::amgxSolver::solve(Foam::Field&, Foam::Field const&, unsigned char) const+0x48
/usr/lib/openfoam/openfoam2112/platforms/linux64GccDPInt32Opt/lib/libfiniteVolume.so : Foam::fvMatrix::solveSegregated(Foam::dictionary const&)+0x625
/usr/lib/openfoam/openfoam2112/platforms/linux64GccDPInt32Opt/lib/libfiniteVolume.so : Foam::fvMatrix::solveSegregatedOrCoupled(Foam::dictionary const&)+0x498
/usr/lib/openfoam/openfoam2112/platforms/linux64GccDPInt32Opt/lib/libfiniteVolume.so : Foam::fvMesh::solve(Foam::fvMatrix&, Foam::dictionary const&) const+0x28
icoFoam : ()+0x289e9
/lib/x86_64-linux-gnu/libc.so.6 : __libc_start_main()+0xf3
icoFoam : ()+0x2a28e
Aborted (core dumped)

snsmssss, Aug 10 '22 19:08

It seems you are using CUDA 11.7, but the solution has only been tested with CUDA 11.2 (I believe that's hinted in the build docs). We are working on support for CUDA 11.7, but we aren't finished yet. Pull request #189 might fix the error above, but you might encounter further issues. The solution should work with CUDA 11.4.
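As a rough illustration of the version check described above, the following Python sketch pulls the CUDA runtime and driver versions out of the AmgX startup banner and compares the runtime against the versions named in this thread (11.2 and 11.4). The function names and the `KNOWN_GOOD_CUDA` set are assumptions made for illustration; this is not part of AmgX or the OpenFOAM tooling, and the set is not an official support matrix.

```python
import re

# Versions named in this thread: 11.2 (tested) and 11.4 (expected to work).
# Assumption for illustration only, not an official support matrix.
KNOWN_GOOD_CUDA = {"11.2", "11.4"}

# Matches the banner line printed at AmgX startup, as seen in the log above.
BANNER_RE = re.compile(
    r"Compiled with CUDA Runtime (\d+\.\d+), using CUDA driver (\d+\.\d+)"
)


def parse_cuda_versions(log_text):
    """Return (runtime, driver) version strings from the banner, or None."""
    match = BANNER_RE.search(log_text)
    if match is None:
        return None
    return match.group(1), match.group(2)


def check_cuda_versions(log_text, known_good=KNOWN_GOOD_CUDA):
    """Return a short verdict on the CUDA runtime reported in the log."""
    versions = parse_cuda_versions(log_text)
    if versions is None:
        return "no CUDA version banner found"
    runtime, driver = versions
    if runtime in known_good:
        return f"CUDA runtime {runtime} (driver {driver}) is a version named in the thread"
    return (
        f"CUDA runtime {runtime} (driver {driver}) is not among "
        f"{sorted(known_good)}; consider rebuilding against one of those"
    )


if __name__ == "__main__":
    banner = (
        "AMGX version 2.2.0.132-opensource Built on Aug 4 2022, 23:33:25 "
        "Compiled with CUDA Runtime 11.7, using CUDA driver 11.7"
    )
    print(check_cuda_versions(banner))
```

Running it on the banner from the log in this issue reports that 11.7 is outside the listed set, which matches the diagnosis given here.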

mattmartineau, Aug 25 '22 09:08

@snsmssss Do you still have this issue? A few changes related to your issue have been merged, so it might be worth trying the latest code.

marsaev, Oct 25 '22 09:10