heatherkellyucl

212 comments by heatherkellyucl

Myriad needs UCX 1.9.0 for OpenMPI 4.1.1 to be able to run multi-node (there is a bug in 1.8.0), so changing to that.
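A minimal sketch of how a build script might guard against the older UCX. It assumes that any version from 1.9.0 onward is acceptable, which the comment above does not state, and that GNU `sort -V` is available. The `version_ge` helper and the hard-coded `installed` value are hypothetical; in practice the installed version would have to be read from the system (for example from the output of `ucx_info -v`).

```shell
#!/bin/sh
# Hypothetical helper: succeeds if version $1 is at least version $2.
# Relies on GNU sort's -V (version sort) option.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Assumed value for illustration only; not read from a real install.
installed="1.8.0"

if version_ge "$installed" "1.9.0"; then
    echo "UCX $installed is new enough"
else
    echo "UCX $installed is too old; need 1.9.0"
fi
```

Version sort is used rather than a plain string comparison so that a version such as 1.10.0 is ordered after 1.9.0.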

Now running fine multi-node on Myriad too.

These modules are needed on clusters other than Myriad:
```
module unload -f compilers mpi
module load compilers/gnu/4.9.2
module load numactl/2.0.12
module load psm2/11.2.185/gnu-4.9.2
module load mpi/openmpi/4.1.1/gnu-4.9.2
```
These are needed on Myriad:
```
module...
```

Is *not* working across two nodes on Young...
```
node-c12m-005.22538PSM2 can't open hfi unit: -1 (err=23)
node-c12l-008.62402PSM2 can't open hfi unit: -1 (err=23)
```
```
node-c12m-005.22538hfi_userinit_internal: assign_context command failed: Device...
```
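When a job produces many of these messages, it can help to see which nodes are affected. The following is a rough sketch only: `failed_psm2_nodes` is a hypothetical helper, and it assumes the node name is everything before the first `.` on the line, as in the two lines quoted above.

```shell
#!/bin/sh
# Hypothetical helper: read job output on stdin and list the distinct
# nodes that reported the PSM2 "can't open hfi unit" error.
# Assumes the node name is the text before the first dot.
failed_psm2_nodes() {
    grep "PSM2 can't open hfi unit" | cut -d. -f1 | sort -u
}

# Sample input: the two log lines quoted in the comment above.
printf '%s\n' \
    "node-c12m-005.22538PSM2 can't open hfi unit: -1 (err=23)" \
    "node-c12l-008.62402PSM2 can't open hfi unit: -1 (err=23)" \
    | failed_psm2_nodes
```

On a real run the function would be fed the job's error file instead of the sample lines.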

For now we have set `OMPI_MCA_btl=vader` in the modulefile for `mpi/openmpi/4.1.1/gnu-4.9.2` on the OmniPath clusters so it will work multi-node, even if a bit slower than it should if using...
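For reference, Open MPI reads MCA parameters from environment variables with the `OMPI_MCA_` prefix, so the same setting can be reproduced by hand in a job script. This is a sketch of the equivalent export, not the contents of the actual modulefile.

```shell
#!/bin/sh
# Equivalent of the modulefile setting described above, applied manually.
# Open MPI picks up MCA parameters from OMPI_MCA_<name> environment variables.
export OMPI_MCA_btl=vader

# Show the value that a subsequent mpirun would inherit.
echo "OMPI_MCA_btl=${OMPI_MCA_btl}"
```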

Running `linaroforge-23.1.1_install` on Kathleen to see if the rest of the steps are the same.

Install appears to have worked. `linaroforge/23.1.1` module on Kathleen for testing.

Looks like we had `/shared/ucl/apps/armforge/20.1.2/templates/basic-sge-mpi.qtf` in the previous version which is not in the buildscript.

Added `files/linaroforge/ucl-basic-sge-mpi.qtf`.