heatherkellyucl
Still fails on 2 nodes with UCX anyway.
We'll fix the RMPISNOW issues and come back to Rmpi on its own at a later date. Last bit of diagnosis for Rmpi, this time on Kathleen: I set this, the first...
C.01 exists in the UCL Software Database now.
Having a look at what the available binaries are:

**Version 2.14 (2020-08-05) Platforms:**

* Linux-x86_64-multicore-CUDA (NVIDIA CUDA acceleration)
* Linux-x86_64-netlrts-smp-CUDA (Multi-copy algorithms, single process per copy)
* Linux-x86_64-verbs-smp-CUDA (InfiniBand, no...
The `NAMD_2.14_Linux-x86_64-multicore-CUDA` binary seems to have found the GPU and done something with it. Was run with 1 GPU and 36 cores as

```
../NAMD_2.14_Linux-x86_64-multicore-CUDA/namd2 +p${NSLOTS} +setcpuaffinity ../apoa1/apoa1_nve_cuda.namd
```

```
Charm++:...
```
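For context, that command would sit in a job script something like the sketch below. The scheduler directives, module name, and runtime are assumptions for illustration, not copied from our actual scripts:

```
#!/bin/bash -l
#$ -l h_rt=1:00:00
#$ -l gpu=1
#$ -pe smp 36
#$ -cwd

# Hypothetical module name -- substitute whatever provides the CUDA runtime.
module load cuda

# Single process, all job slots as worker threads, pinned to cores.
../NAMD_2.14_Linux-x86_64-multicore-CUDA/namd2 +p${NSLOTS} +setcpuaffinity ../apoa1/apoa1_nve_cuda.namd
```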
Also worked and allocated cores with 4 GPUs:

```
Pe 16 physical rank 16 binding to CUDA device 1 on node-l00a-001.myriad.ucl.ac.uk: 'A100-PCIE-40GB'  Mem: 40536MB  Rev: 8.0  PCI: 0:2f:0
Pe 32...
```
After discussion (I was partway through modifying our current buildscripts, which use Intel, into ones using GCC and CUDA): we should update the CUDA modules so they no longer require the...
Hmm, the Intel 2018 module has symlinks in its `intel64/lib` directory to a number of libraries that live in `release_mt`, including `libmpi.so` and `libmpi.a`. Our Intel 2019 install does not...
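To make the difference concrete, here is a throwaway mock of the 2018-style layout and the relative symlinks it carries (all paths here are invented for the sketch; nothing is the real install):

```shell
# Mock directory standing in for the Intel install prefix.
tmp=$(mktemp -d)
mkdir -p "$tmp/intel64/lib/release_mt"
touch "$tmp/intel64/lib/release_mt/libmpi.so" "$tmp/intel64/lib/release_mt/libmpi.a"

# Recreate the relative symlinks the 2018 module has and the 2019 install lacks,
# so intel64/lib/libmpi.so -> release_mt/libmpi.so and likewise for the .a.
(cd "$tmp/intel64/lib" && ln -s release_mt/libmpi.so release_mt/libmpi.a .)

# With those links in place, -L$prefix/intel64/lib -lmpi resolves without
# needing release_mt on the link path.
ls -l "$tmp/intel64/lib"
```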
Charm++ does not like that combo :(

```
/lustre/shared/ucl/apps/gcc/10.2.0-p95889/bin/../include/c++/10.2.0/bits/atomic_base.h(74): error: invalid redefinition of enum "std::memory_order" (declared at line 168 of "/lustre/shared/ucl/apps/intel/2019.Update5/compilers_and_libraries_2019.5.281/linux/compiler/include/stdatomic.h")
  typedef enum memory_order
compilation aborted for DummyLB.C (code...
```
Charm++ builds, then NAMD configure complains:

```
ERROR: MPI-based Charm++ arch mpi-linux-x86_64-iccstatic is not compatible with CUDA NAMD.
ERROR: Non-SMP Charm++ arch mpi-linux-x86_64-iccstatic is not compatible with CUDA NAMD.
ERROR:...
```
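Those errors say CUDA NAMD wants an SMP, non-MPI Charm++ arch. A sketch of what the rebuild might look like, with the arch choice inferred from the error text and the prebuilt binary names above (not a tested recipe):

```
# In the charm source tree: multicore-linux-x86_64 is single-node SMP;
# "verbs-linux-x86_64 smp" would be the InfiniBand multi-node equivalent.
./build charm++ multicore-linux-x86_64 --with-production

# Then point NAMD's configure at that arch instead of mpi-linux-x86_64-iccstatic.
./config Linux-x86_64-g++ --charm-arch multicore-linux-x86_64 --with-cuda
```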