rcps-buildscripts
Install Request: GPU build of LAMMPS [IN05118528] [IN05126499]
EPSRC work.
Current stable version is 29 Sep 2021 update 2.
https://github.com/lammps/lammps/releases https://docs.lammps.org/stable/
According to the install documentation, LAMMPS now recommends building with CMake instead of system-tailored Makefiles. Our previous builds used the tailored-Makefile method.
I'm going to start using the cmake method and try a test build. This will need a new build script.
So we are building LAMMPS 29th September 2021 Update 2. Source downloadable from:
https://github.com/lammps/lammps/archive/refs/tags/stable_29Sep2021_update2.tar.gz
We need two new build scripts:
lammps-29Sep21_2-basic_install
lammps-29Sep21_2-gpu_install
I'm making first versions of them now.
First attempt at build script ready. Running as ccspapp:
cd /shared/ucl/apps/build_scripts
./lammps-29Sep21_2-basic_install 2>&1 | tee ~/Software/LAMMPS/lammps-29Sep21_2-basic_install.log-19012022-1
First attempt failed:
CMake Error at /lustre/shared/ucl/apps/cmake/3.21.1/gnu-4.9.2/share/cmake-3.21/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
Could NOT find Python (missing: Python_INCLUDE_DIRS Python_LIBRARIES
Development Development.Module Development.Embed)
Call Stack (most recent call first):
/lustre/shared/ucl/apps/cmake/3.21.1/gnu-4.9.2/share/cmake-3.21/Modules/FindPackageHandleStandardArgs.cmake:594 (_FPHSA_FAILURE_MESSAGE)
/lustre/shared/ucl/apps/cmake/3.21.1/gnu-4.9.2/share/cmake-3.21/Modules/FindPython.cmake:556 (find_package_handle_standard_args)
Modules/Packages/PYTHON.cmake:6 (find_package)
CMakeLists.txt:445 (include)
-- Configuring incomplete, errors occurred!
The configure failed because CMake could not find the Python development headers and libraries, so a Python module needs to be loaded before configuring.
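A sketch of the fix: load a Python module before configuring, and if CMake still picks up the wrong interpreter, hint it explicitly. `Python_EXECUTABLE` is a standard CMake FindPython cache variable; the module name below is the one used elsewhere in this log, and whether the extra hint is needed here is an assumption.

```shell
# Make the Python development files visible to CMake's FindPython.
module load python3/recommended

# If CMake still finds the wrong Python, point it at the right interpreter
# explicitly (Python_EXECUTABLE is a standard FindPython hint variable):
cmake -D Python_EXECUTABLE="$(command -v python3)" ../cmake
```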
This time it has passed the configuration stage and is compiling stuff.
Build finished with no obvious errors but I will need to check the build log carefully.
Added building of shared libraries, since this isn't the default and is needed for plugin loading.
Build has finished and the shared library has been built.
I've now also added an option to build LAMMPS unit tests. Run it like this:
BUILD_UNIT_TESTS=yes ./lammps-29Sep21_2-basic_install 2>&1 | tee ~/Software/LAMMPS/lammps-29Sep21_2-basic_install.log-24012022-1
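A minimal sketch of how the BUILD_UNIT_TESTS switch might be wired up inside the build script (the variable name comes from the command above; the exact implementation in the script is an assumption). `-D ENABLE_TESTING=on` is the LAMMPS CMake option that compiles its unit tests.

```shell
# Sketch: gate the LAMMPS unit-test build on an environment variable
# (assumed implementation; the caller exports BUILD_UNIT_TESTS=yes).
BUILD_UNIT_TESTS=${BUILD_UNIT_TESTS:-no}   # default: don't build tests

CMAKE_TEST_FLAGS=""
if [ "$BUILD_UNIT_TESTS" = "yes" ]; then
    # ENABLE_TESTING=on tells the LAMMPS CMake build to compile its unit tests
    CMAKE_TEST_FLAGS="-D ENABLE_TESTING=on"
fi
echo "Extra CMake flags: '$CMAKE_TEST_FLAGS'"
# cmake $CMAKE_TEST_FLAGS ... ../cmake    # appended to the configure line
```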
Updating the GPU build script to incorporate the changes from lammps-29Sep21_2-basic_install, with the GPU-specific parts added.
Running the unit tests is failing. Running:
module -f unload compilers mpi
module load compilers/nvidia/hpc-sdk/22.1
module load python3/recommended
cd /home/ccspapp/Software/LAMMPS/tmp.tDAcUNTvaj/lammps-stable_29Sep2021_update2/build
ctest -V
gives:
1: HWLOC_HIDE_ERRORS=1
1: Test timeout computed to be: 1500
1: /home/ccspapp/Software/LAMMPS/tmp.tDAcUNTvaj/lammps-stable_29Sep2021_update2/build/lmp: /usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by /home/ccspapp/Software/LAMMPS/tmp.tDAcUNTvaj/lammps-stable_29Sep2021_update2/build/liblammps.so.0)
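One way to diagnose this kind of `GLIBCXX_… not found` error is to compare the symbol versions the system libstdc++ provides with the ones the freshly built library requires. The paths below come from the error message above; the `strings`/`objdump` approach is a generic diagnostic technique, not part of the build script.

```shell
# Symbol versions the old system libstdc++ actually provides:
strings /usr/lib64/libstdc++.so.6 | grep -o 'GLIBCXX_[0-9.]*' | sort -uV | tail -3

# Symbol versions the newly built library requires:
objdump -T build/liblammps.so.0 | grep -o 'GLIBCXX_[0-9.]*' | sort -uV | tail -3
```

If a required version (GLIBCXX_3.4.20 here, which corresponds to GCC 4.9) is missing from the first list, the runtime needs a newer gcc-libs module on LD_LIBRARY_PATH.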
So we need to use a more up-to-date gcc-libs module. I'm also going to build versions using the GNU compilers (10.2.0) and OpenMPI, plus CUDA 11 for the GPU build, in addition to the Nvidia-compiler versions.
Now building the revised basic Nvidia version.
GNU compilers and OpenMPI build script ready to test.
Running:
BUILD_UNIT_TESTS=yes ./lammps-29Sep21_2-basic-gnu_install 2>&1 | tee ~/Software/LAMMPS/lammps-29Sep21_2-basic-gnu_install.log-26012022-1
to build GNU version with unit tests.
The Nvidia build is still not working correctly, so I have switched to the GNU build for the moment.
The GNU build of the basic version has completed. I've quickly run the first couple of unit tests and they pass, so I will submit a job to run the full set tomorrow.
Will now set up the GPU version build script.
NOTE: the following modules are needed for the build and runtime for the basic version:
module -f unload compilers mpi gcc-libs
module load beta-modules
module load gcc-libs/10.2.0
module load compilers/gnu/10.2.0
module load numactl/2.0.12
module load binutils/2.36.1/gnu-10.2.0
module load ucx/1.9.0/gnu-10.2.0
module load mpi/openmpi/4.0.5/gnu-10.2.0
module load cmake/3.21.1
module load python3/3.9-gnu-10.2.0
Unit Test job for the GNU basic version submitted.
Unit tests all passed:
100% tests passed, 0 tests failed out of 481
Total Test time (real) = 436.38 sec
Now need a module file and can then try running some real examples.
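As a sketch, a first "real example" run on Myriad might look like the jobscript below once the module file exists. The scheduler directives, core count and choice of input are illustrative assumptions (bench/in.lj is one of the benchmark inputs shipped with LAMMPS); the module list is the runtime list noted in this log.

```shell
#!/bin/bash -l
# Illustrative SGE jobscript for the basic GNU build (values are examples).
#$ -l h_rt=1:00:00
#$ -pe mpi 8
#$ -cwd

module -f unload compilers mpi gcc-libs
module load beta-modules
module load gcc-libs/10.2.0
module load compilers/gnu/10.2.0
module load python3/3.9-gnu-10.2.0
module load mpi/openmpi/4.0.5/gnu-10.2.0
module load lammps/29sep21up2/basic/gnu-10.2.0

# Run one of the bundled LAMMPS benchmark inputs (bench/in.lj).
mpirun -np 8 lmp -in in.lj
```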
GNU version of GPU build script ready to run using these configuration options:
cmake -C ../cmake/presets/gcc.cmake -C ../cmake/presets/most.cmake -D GPU_API=cuda -D GPU_PREC=mixed -D GPU_ARCH=sm_80 -D BUILD_SHARED_LIBS=yes -D CMAKE_INSTALL_PREFIX=${INSTALL_PREFIX} ../cmake
If you can give it multiple GPU architectures, it's worth doing sm_60, sm_70 and sm_80 so it works on all of Myriad's GPUs.
According to the documentation, -D GPU_ARCH=sm_80 is the default, and the build should also include "support for all major GPU architectures supported by" the loaded CUDA module. sm_80 is currently the newest GPU architecture supported by LAMMPS and is present in CUDA 11.
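To check which compute capability the GPUs on a given node actually report (and hence which sm_XX values matter), recent nvidia-smi versions can query it directly; the `compute_cap` query field needs a reasonably new NVIDIA driver.

```shell
# Query the compute capability of each GPU on the node.
# Per the note above, Myriad's GPUs should report 6.0/7.0/8.0 class
# devices, i.e. sm_60/sm_70/sm_80.
nvidia-smi --query-gpu=name,compute_cap --format=csv
```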
Running GPU build:
module -f unload gcc-libs
module load beta-modules
BUILD_UNIT_TESTS=yes ./lammps-29Sep21_2-gpu-gnu_install 2>&1 | tee ~/Software/LAMMPS/lammps-29Sep21_2-gpu-gnu_install.log-27012022-1
GPU build has finished.
Checking to see if it has built correctly.
More work needed on the build script - it has completely ignored building with CUDA!
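For reference, the GPU-related -D options only take effect when the GPU package itself is switched on. A plausible fix (an assumption, since the script isn't shown here) is that the configure line needs an explicit -D PKG_GPU=on if the most.cmake preset does not already enable that package:

```shell
# GPU_API/GPU_ARCH are ignored unless the GPU package is enabled:
cmake -C ../cmake/presets/gcc.cmake -C ../cmake/presets/most.cmake \
      -D PKG_GPU=on \
      -D GPU_API=cuda -D GPU_PREC=mixed -D GPU_ARCH=sm_80 \
      -D BUILD_SHARED_LIBS=yes \
      -D CMAKE_INSTALL_PREFIX=${INSTALL_PREFIX} ../cmake

# Quick check after building: the help output lists installed packages,
# which should now include GPU.
./lmp -h | grep -A2 -i 'Installed packages'
```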
IN05118528 wants to use the basic (MPI) version on Young now.
Basic GNU build version module ready to use for testing on Myriad. Needs the following module commands:
module -f unload compilers mpi gcc-libs
module load beta-modules
module load gcc-libs/10.2.0
module load compilers/gnu/10.2.0
module load python3/3.9-gnu-10.2.0
# The following three are only needed on Myriad.
module load numactl/2.0.12
module load binutils/2.36.1/gnu-10.2.0
module load ucx/1.9.0/gnu-10.2.0
module load mpi/openmpi/4.0.5/gnu-10.2.0
module load lammps/29sep21up2/basic/gnu-10.2.0
Building the basic non GPU version on Kathleen to test multi-node stuff prior to building on Young.
Fixed (I think) the GPU build script so it actually builds the GPU version! Building it again.