rcps-buildscripts

R: install R 4.2.3 with Bioconductor 3.16 [IN05922352]

Open balston opened this issue 1 year ago • 34 comments

A user cannot install the Bioconductor package inferCNV in R 4.2.0, and neither can I. I've also tried it with the latest Myriad installation, R 4.2.2, and it still doesn't work.

Installing R 4.2.3 with Bioconductor 3.16 should, I think, resolve the problem. R 4.2.3 was released on 15th March.

Need this on Myriad first.

balston avatar Apr 17 '23 14:04 balston

This cannot really wait for the Spack R build process, so I am first updating the build script for base R plus the recommended packages.

balston avatar Apr 17 '23 14:04 balston

Running from build_scripts:

module -f unload compilers mpi gcc-libs
./R-4.2.3_install 2>&1 | tee ~/Software/R/R-4.2.3_install.log-17042023

balston avatar Apr 17 '23 15:04 balston

Build finished without errors. It also runs checks and the R regression tests and they have worked without errors as well. Will update the R base module file and then get the additional packages build script updated.

balston avatar Apr 17 '23 15:04 balston

Running from build_scripts:

./R-4.2.3_packages_install 2>&1 | tee ~/Software/R/R-4.2.3_packages_install.log-17042023

balston avatar Apr 17 '23 16:04 balston

The additional packages build finished about 11pm last night. Today I will check the output for errors.

We add about 675 additional packages (including automatic dependencies) which is why it takes so long to run!
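
With this many packages, checking the log by eye is slow. A minimal sketch of a first-pass filter follows; the function name and the pattern list are illustrative assumptions (the patterns are strings that R and the shell print when a package build fails), and a clean result does not prove that every package installed.

```shell
# Sketch: list log lines that usually indicate a failed R package build.
# scan_r_install_log is a hypothetical helper, not part of the build scripts.
scan_r_install_log() {
    # $1: path to a log written with "... 2>&1 | tee logfile"
    # Prints matching lines with their line numbers; returns non-zero if none match.
    grep -n -E '^ERROR:|non-zero exit status|Segmentation fault' "$1"
}
```

For example, `scan_r_install_log ~/Software/R/R-4.2.3_packages_install.log-17042023` would print only the suspicious lines, each prefixed with its line number in the log.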

balston avatar Apr 18 '23 08:04 balston

There is at least one real error:

Package: rgl:

** R
** demo
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
*** copying figures
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
sh: line 1: 208295 Segmentation fault      R_TESTS= '/shared/ucl/apps/R/R-4.2.3-OpenBLAS/lib64/R/bin/R' --no-save --no-restore --no-echo 2>&1 < '/tmp/RtmpVUcYfl/file2f3bb5766cdb8'
ERROR: loading failed
* removing ‘/lustre/shared/ucl/apps/R/R-4.2.3-OpenBLAS/lib64/R/library/rgl’

The downloaded source packages are in
	‘/tmp/Rtmpgz3U7J/downloaded_packages’
Warning message:
In install.packages("rgl", lib = mainLib, repos = repros) :
  installation of package ‘rgl’ had non-zero exit status

Not sure if this is a real error:

package RcppParallel:

** R
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded from temporary location
** checking absolute paths in shared objects and dynamic libraries
readelf: Error: /lustre/shared/ucl/apps/R/R-4.2.3-OpenBLAS/lib64/R/library/00LOCK-RcppParallel/00new/RcppParallel/lib/libtbb.so: Failed to read file header
readelf: Error: /lustre/shared/ucl/apps/R/R-4.2.3-OpenBLAS/lib64/R/library/00LOCK-RcppParallel/00new/RcppParallel/lib/libtbbmalloc.so: Failed to read file header
readelf: Error: /lustre/shared/ucl/apps/R/R-4.2.3-OpenBLAS/lib64/R/library/00LOCK-RcppParallel/00new/RcppParallel/lib/libtbbmalloc_proxy.so: Failed to read file header
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (RcppParallel)
*

balston avatar Apr 18 '23 13:04 balston

The error building the rgl package was caused by the runtime load test failing because of a problem with the XQuartz X server running on my MacBook. Even the glxgears example failed!

Solution - use a working X server supporting OpenGL - in my case a Linux VM. Now running:

./R-4.2.3_single_package_install rgl 2>&1 | tee ~/Software/R/R-4.2.3_rgl_install.log

from the VM works.
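
Since the rgl load test needs a usable display, a quick check before starting the build can save a rerun. The sketch below is an assumption-laden example: check_opengl_display is a hypothetical helper, and it assumes glxinfo (usually shipped alongside glxgears) is installed on the machine running the build.

```shell
# Sketch: refuse to start the rgl build unless an X display can be opened.
# check_opengl_display is a hypothetical helper, not part of the build scripts.
check_opengl_display() {
    if [ -z "${DISPLAY:-}" ]; then
        echo "DISPLAY is not set: no X server to test against" >&2
        return 1
    fi
    if ! command -v glxinfo >/dev/null 2>&1; then
        echo "glxinfo not found: cannot probe the display" >&2
        return 2
    fi
    # glxinfo exits non-zero when it cannot open the display.
    glxinfo >/dev/null 2>&1
}
```

It could be chained in front of the build, for example `check_opengl_display && ./R-4.2.3_single_package_install rgl`. A passing check only shows that the display opens; running glxgears remains the more direct test of OpenGL rendering.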

balston avatar Apr 18 '23 14:04 balston

The RcppParallel package appears to have been installed correctly.
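
The readelf messages earlier say only that the three .so files could not be parsed as ELF objects. The thread does not record what those files contain; one possibility (an assumption, not confirmed here) is that they are small text linker scripts rather than real shared objects. A sketch for telling the two cases apart, using a hypothetical helper name:

```shell
# Sketch: succeed if a file starts with the ELF magic bytes 0x7f 'E' 'L' 'F'.
# is_elf is a hypothetical helper name.
is_elf() {
    # Read the first four bytes and render them as a hex string.
    magic=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "7f454c46" ]
}
```

Looping over the installed package's lib directory with `for f in "$lib_dir"/*.so; do is_elf "$f" || echo "not ELF: $f"; done` (where lib_dir is a placeholder for that directory) would list any file that readelf cannot be expected to read.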

balston avatar Apr 18 '23 15:04 balston

Added MPI support packages:

./R-4.2.3_MPI_install 2>&1 | tee ~/Software/R/R-4.2.3_MPI_install.log-18042023

without errors.

balston avatar Apr 18 '23 15:04 balston

Module bundle done and single Bioconductor package build script done. To use R 4.2.3 the following module commands are needed:

module -f unload compilers mpi gcc-libs
module load beta-modules
module load r/r-4.2.3_bc-3.16

balston avatar Apr 18 '23 15:04 balston

OK I've now tried to install inferCNV with R 4.2.3 from my userid using:

module -f unload compilers mpi gcc-libs
module load beta-modules
module load r/r-4.2.3_bc-3.16
R

if (!requireNamespace ("BiocManager"))
    install.packages ("BiocManager")
BiocManager::install ()
BiocManager::install ("infercnv")

It still fails, but now with only the following error (there were several errors before):

ERROR: dependency ‘rjags’ is not available for package ‘infercnv’
* removing ‘/lustre/home/ccaabaa/R/x86_64-pc-linux-gnu-library/4.2/infercnv’

This is because I don't have a version of JAGS that can be used with R 4.2.3.

Solution = build one!

balston avatar Apr 19 '23 09:04 balston

JAGS 4.3.2 built and modules updated.

balston avatar Apr 19 '23 12:04 balston

Trying to build the rjags package but it is failing.

balston avatar Apr 19 '23 14:04 balston

OK I've worked out what the problem is - JAGS 4.3.2 is too new for the version of rjags on CRAN. I've done a test build of JAGS 4.3.1 in my Scratch and it all works when I install rjags from my R session.

Will now build everything in /shared/ucl/apps ...
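
A version guard in the build script could catch this kind of mismatch before the rjags build starts. The sketch below assumes GNU sort with the -V option; version_le is a hypothetical helper, and the 4.3.1 ceiling reflects only what worked in this thread at the time, so it would need revisiting when rjags is updated on CRAN.

```shell
# Sketch: succeed when version $1 is less than or equal to version $2.
# Relies on "sort -V" (GNU coreutils version sort); version_le is a hypothetical helper.
version_le() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Example use (assumes pkg-config can find the JAGS .pc file):
#   jags_version=$(pkg-config --modversion jags)
#   version_le "$jags_version" 4.3.1 || echo "JAGS $jags_version may be too new for rjags" >&2
```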

balston avatar Apr 20 '23 10:04 balston

JAGS 4.3.1 installed, modules updated. Installing rjags package centrally using:

./R-4.2.3_single_package_install rjags 'configure.args="--enable-rpath"'

works:

** R
** data
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded from temporary location
** checking absolute paths in shared objects and dynamic libraries
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (rjags)

Before this, rjags had been failing at the "** testing if installed package can be loaded from temporary location" stage.

balston avatar Apr 20 '23 12:04 balston

I've now been able to install the infercnv package from a clean R session in my account:

* installing *source* package ‘infercnv’ ...
** using staged installation
** R
** data
*** moving datasets to lazyload DB
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (infercnv)

balston avatar Apr 20 '23 13:04 balston

I've emailed the user about inferCNV.

balston avatar Apr 20 '23 13:04 balston

Next task is to check out the R MPI stuff.

balston avatar Apr 21 '23 11:04 balston

Started the build on Kathleen.

balston avatar May 12 '23 09:05 balston

The R build on Kathleen is now completed. Just need to check and test ...

balston avatar May 15 '23 11:05 balston

On Myriad I'm submitting test jobs for doMPI and snow.

balston avatar Aug 01 '23 09:08 balston

R 4.2.3 doMPI example works correctly.

balston avatar Aug 01 '23 09:08 balston

The snow example has failed.

--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[3999,1],4]
  Exit code:    1
--------------------------------------------------------------------------

balston avatar Aug 01 '23 09:08 balston

I've now added:

export OMPI_MCA_state_base_verbose=5 
export OMPI_MCA_mca_base_component_show_load_errors=1

and get this additional error info:

> # Get a reference to our snow cluster that has been set up by the RMPISNOW
> # script.
> cl <- getMPIcluster ()
>
> # Display info about each process in the cluster
> print(clusterCall(cl, function() Sys.info()))
[node-b00a-014.myriad.ucl.ac.uk:178470] [[28092,0],0] ACTIVATE PROC [[28092,1],4] STATE EXITED WITH NON-ZERO STATUS AT base/odls_base_default_fns.c:1741

balston avatar Aug 01 '23 09:08 balston

I've also submitted a bare Rmpi job which might help diagnose what is going on

balston avatar Aug 01 '23 10:08 balston

The Rmpi test job worked correctly. So I modified the snow test job to start the MPI R slaves in the same way and it now works too.

balston avatar Aug 01 '23 14:08 balston

So:

  • doMPI works when started with, for example, gerun Rscript doMPI_example.R;
  • Rmpi works when started with mpirun -np 1 R CMD BATCH rmpitest1.R;
  • snow now works when started with mpirun -np 1 R CMD BATCH snow_example.R.

Both the Rmpi and snow jobscripts need to have:

export OMPI_MCA_mtl="^psm2"
export OMPI_MCA_pml="cm"

in them. Within the R scripts, the R MPI slaves can be launched as follows. For snow:

cl <- makeCluster(mpi.universe.size() - 1, type = "MPI")

and for Rmpi:

ns <- mpi.universe.size() - 1
mpi.spawn.Rslaves(nslaves=ns)
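
The jobscript side of the Rmpi and snow findings can be sketched as one shell function. The function name and the DRY_RUN switch are hypothetical conveniences; the two exports and the mpirun line are the ones quoted in this comment. In Open MPI's MCA syntax, the leading ^ excludes the named component, so the first export excludes the psm2 MTL and the second selects the cm PML.

```shell
# Sketch: launch a single R master under mpirun with the MCA settings from this thread.
# run_r_mpi_master and DRY_RUN are hypothetical names, not part of any jobscript here.
run_r_mpi_master() {
    # $1: R script that spawns its own workers (snow makeCluster or Rmpi mpi.spawn.Rslaves)
    export OMPI_MCA_mtl="^psm2"
    export OMPI_MCA_pml="cm"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the command instead of running it.
        echo "mpirun -np 1 R CMD BATCH $1"
    else
        mpirun -np 1 R CMD BATCH "$1"
    fi
}
```

Only one process is started by mpirun because the R script itself creates the remaining workers, sized from mpi.universe.size() - 1.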

balston avatar Aug 01 '23 14:08 balston

Now running tests on Kathleen:

  • doMPI example across 2 compute nodes works.

balston avatar Aug 01 '23 16:08 balston

The Rmpi test job on Kathleen has failed:

It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  ompi_dpm_dyn_init() failed
  --> Returned "Not found" (-13) instead of "Success" (0)
--------------------------------------------------------------------------
[node-c11a-022:01929] *** An error occurred in MPI_Init
[node-c11a-022:01929] *** reported by process [4095934466,49]
[node-c11a-022:01929] *** on a NULL communicator
[node-c11a-022:01929] *** Unknown error
[node-c11a-022:01929] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[node-c11a-022:01929] ***    and potentially your MPI job)
[node-c11a-008:67709] 39 more processes have sent help message help-mpi-runtime.txt / mpi_init:startup:internal-failure
[node-c11a-008:67709] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
[node-c11a-008:67709] 39 more processes have sent help message help-mpi-errors.txt / mpi_errors_are_fatal unknown handle
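
The output above names the MCA parameter orte_base_help_aggregate. MCA parameters can be set through environment variables prefixed with OMPI_MCA_, as with the other settings in this thread, so a rerun of the failing job could add the line below to the jobscript. This is a diagnostic sketch only: it shows the suppressed messages and is not expected to fix the MPI_Init failure.

```shell
# Sketch: show every help/error message instead of the aggregated summary.
export OMPI_MCA_orte_base_help_aggregate=0
```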

balston avatar Aug 02 '23 09:08 balston

Have submitted a job testing snow on Kathleen since it doesn't say above if that works or not.

heatherkellyucl avatar Nov 20 '23 12:11 heatherkellyucl