balston
Started the build on Kathleen.
The R build on Kathleen has now completed. It just needs checking and testing ...
On Myriad I'm submitting test jobs for doMPI and snow.
R 4.2.3 doMPI example works correctly.
The snow example has failed.
```
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected...
```
I've now added:
```
export OMPI_MCA_state_base_verbose=5
export OMPI_MCA_mca_base_component_show_load_errors=1
```
and get this additional error info:
```
> # Get a reference to our snow cluster that has been set up...
```
I've also submitted a bare Rmpi job which might help diagnose what is going on
The Rmpi test job worked correctly. So I modified the snow test job to start the MPI R slaves in the same way and it now works too.
So:
- **doMPI** works using _gerun Rscript doMPI_example.R_ for example to start it;
- **Rmpi** works with _mpirun -np 1 R CMD BATCH rmpitest1.R_ to start it;
- **snow** now...
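
For reference, the launch commands listed above could be collected in a small shell helper for the test job scripts. This is only a sketch: the function names `launch_cmd` and `enable_mpi_debug` are hypothetical, and only the doMPI and Rmpi commands are included because those are the two spelled out in full here.

```shell
#!/bin/bash
# Hypothetical helper: print the launch command reported to work for a package.
# Only doMPI and Rmpi are covered; anything else is reported as unknown.
launch_cmd() {
  case "$1" in
    doMPI) echo "gerun Rscript doMPI_example.R" ;;
    Rmpi)  echo "mpirun -np 1 R CMD BATCH rmpitest1.R" ;;
    *)     echo "no launch command recorded for: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: turn on the extra Open MPI diagnostics
# that were used while looking into the snow failure.
enable_mpi_debug() {
  export OMPI_MCA_state_base_verbose=5
  export OMPI_MCA_mca_base_component_show_load_errors=1
}

# Print the command for the doMPI test.
launch_cmd doMPI
```

In a job script the printed command could then be run with something like `eval "$(launch_cmd Rmpi)"`, optionally after calling `enable_mpi_debug` when a run needs diagnosing.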
Now running tests on Kathleen:
- **doMPI** example across 2 compute nodes works.