
Issue with singularity when running infer_trajectory

Open FloWuenne opened this issue 5 years ago • 20 comments

Hi,

thanks for developing this amazing package and for the incredible amount of work you put into benchmarking all of these tools, a great effort!

I was trying to run the tutorial to test dyno's functionalities. I am using dyno_0.9.9 and tidyverse_1.2.1.

I first tried on my local machine:

R version 3.5.2 (2018-12-20) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Ubuntu 18.04.2 LTS

I installed singularity and the checks from dynwrap::test_singularity_installation(detailed = TRUE) are all positive. I'm using singularity version 3.0.3. When I try to run any method using infer_trajectory, I get the following error:

methods_selected <- c("slingshot", "paga_tree", "scorpius", "angle")

model <- infer_trajectory(dataset, methods_selected[1], verbose = TRUE, debug = TRUE)

hi hi Error: FATAL: failed to retrieved path for /home/florian/Dyno_trajectories/.singularity/dynverse/ti_slingshot_tag-v0.9.9.simg: lstat /home/florian/Dyno_trajectories/.singularity/dynverse/ti_slingshot_tag-v0.9.9.simg: no such file or directory ERROR : Child exit with status 255

I have tried the same procedure on an HPC cluster that is running R 3.5.0, where singularity 3.1 is installed as a module that we can load. I get the exact same error that the simg file is not found.

Could you please help me troubleshoot this issue?

Thanks,

Florian

FloWuenne avatar Apr 04 '19 13:04 FloWuenne

Hi Florian

Thanks for trying out the package! Could you perhaps tell me whether the .singularity folder exists, and which files it contains?

Wouter

zouter avatar Apr 04 '19 14:04 zouter

Absolutely, the .singularity folder contains 2 folders:

  • dynverse - Empty
  • cache - Three subfolders: 1) oci, 2) oci-tmp, 3) shub

FloWuenne avatar Apr 04 '19 14:04 FloWuenne

@FloWuenne Thanks for making this issue. I found what the problem was. Could you reinstall babelwhale to see if this solves your problem?

devtools::install_github("dynverse/babelwhale")

library(dyno)
data("fibroblast_reprogramming_treutlein")

babelwhale::set_default_config(babelwhale::create_singularity_config())

model <- infer_trajectory(fibroblast_reprogramming_treutlein, ti_slingshot(), verbose = TRUE)
plot_graph(model)

rcannood avatar Apr 05 '19 14:04 rcannood

Thanks for the fix @rcannood .

I tried your suggestion but I am running into another error now:

model <- infer_trajectory(fibroblast_reprogramming_treutlein, ti_slingshot(), verbose = TRUE)
Executing 'slingshot' on 'id'
With parameters: list(shrink = 1L, reweight = TRUE, reassign = TRUE, thresh = 0.001, maxit = 10L, stretch = 2L, smoother = "smooth.spline", shrink.method = "cosine"), inputs: counts, and priors :
Loading required namespace: hdf5r
Input saved to /tmp/Rtmp243EXd/file188bce0a052/ti
Running /usr/local/bin/singularity run --pwd /ti/workspace -B
'/tmp/Rtmp243EXd/file188bce0a052/ti:/ti,/tmp/Rtmp243EXd/file188b586f5119/tmp:/tmp2'
'docker://dynverse/ti_slingshot:v0.9.9' --dataset /ti/input.h5 --output /ti/output.h5
WARNING: Could not set container working directory /ti/workspace: chdir /ti/workspace: no such file or directory
Loading required package: dynutils
Error in normalise(counts) : trying to get slot "x" from an object of a basic class ("matrix") with no slots
Calls: <Anonymous> -> parse_dataset -> normalise
Execution halted
Error: Error during trajectory inference
WARNING: Could not set container working directory /ti/workspace: chdir /ti/workspace: no such file or directory
Loading required package: dynutils
Error in normalise(counts) : trying to get slot "x" from an object of a basic class ("matrix") with no slots
Calls: <Anonymous> -> parse_dataset -> normalise
Execution halted

Is there an easy way I can fix this error from my side? Seems to me like it is trying to change into a directory that doesn't exist?

FloWuenne avatar Apr 05 '19 14:04 FloWuenne

At least we can already see slingshot running :)

You can ignore the chdir warning, it's a false warning that is being printed in some versions of singularity (sylabs/singularity#2778).

It seems I was cutting some corners with the code I provided earlier. Could you try again with the following code?

library(dyno)
data("fibroblast_reprogramming_treutlein")
dataset <- wrap_expression(
  counts = fibroblast_reprogramming_treutlein$counts,
  expression = fibroblast_reprogramming_treutlein$expression
)
model <- infer_trajectory(dataset, ti_slingshot(), verbose = TRUE)
plot_graph(model)

rcannood avatar Apr 05 '19 14:04 rcannood

Works like a charm now! Thank you so much for fixing this issue so quickly!

I tried with 3 different methods ("angle", "slingshot" and "scorpius") which all worked!

You can close the issue!

FloWuenne avatar Apr 05 '19 15:04 FloWuenne

Sorry for bringing up a small problem after closing the issue. I hadn't tested the fix on our HPC environment. On the HPC I now get the following error when running infer_trajectory():

model <- infer_trajectory(dataset, ti_slingshot(), verbose = TRUE)

Error in processx::run("singularity", c("exec", paste0("docker://", container_id), : System command error

Might be related to different OS on the HPC node?

Any ideas of how to fix this problem?

FloWuenne avatar Apr 05 '19 18:04 FloWuenne

No problem! Could you maybe first try dynwrap::test_singularity_installation(detailed = TRUE) to make sure singularity is working fine? Thanks!

zouter avatar Apr 06 '19 02:04 zouter

Yes of course. I did try this before the fix and all the checks passed. I retried now after the fix and all the checks still pass!

dynwrap::test_singularity_installation(detailed = TRUE)
✔ Singularity is installed
✔ Singularity is at correct version (>=3.0): 3.1.0-35.ge15c551 is installed
✔ Singularity can pull and run a container from Dockerhub
✔ Singularity can mount temporary volumes
✔ Singularity test successful
------------------------------------------------------------
[1] TRUE

FloWuenne avatar Apr 06 '19 20:04 FloWuenne

Hi @FloWuenne

Something is going wrong while pulling the container (which is also the reason you don't see any output). Could you try to do singularity exec 'docker://dynverse/ti_slingshot:v0.9.9' echo hi to see where it complains?

Thanks

zouter avatar Apr 08 '19 09:04 zouter

Hi @zouter ,

thanks for helping to troubleshoot this issue!

I ran singularity exec 'docker://dynverse/ti_slingshot:v0.9.9' echo hi in the command line on the HPC login node and this was the error:

Writing manifest to image destination
Storing signatures
INFO:    Creating SIF file...
FATAL:   Unable to handle docker://dynverse/ti_slingshot:v0.9.9 uri: unable to build: While running mksquashfs: exit status 1: FATAL ERROR:Failed to create thread

FloWuenne avatar Apr 08 '19 12:04 FloWuenne

Oh dear, there's an error I hoped I'd never see again.

The direct cause is mksquashfs trying to use all the cores that the machine has. I see this behaviour on our cluster as well; at some point the whole machine is fully loaded by squashfs (see screenshot below). An issue (sylabs/singularity#1228) was created for this a while ago, but afaik nothing has been done about it. We need some way to force mksquashfs to work with just one core...

screenshot

rcannood avatar Apr 08 '19 13:04 rcannood

I was hoping that the singularity source code would reveal a magic option for passing -processors 1 to the mksquashfs call, but that does not seem to be the case.

rcannood avatar Apr 08 '19 13:04 rcannood

Hi guys, I think this is very similar to Florian's error, but in case it helps, this is what I get when I try to test singularity with dyno:

FATAL: Unable to handle docker://alpine:3.7 uri: unable to build: While running mksquashfs: exit status 1: FATAL ERROR:Failed to create thread
Error in test_singularity_installation() : ❌ Singularity is unable to run pull and run a container from Dockerhub.

Cheers

gaelcge avatar Apr 08 '19 16:04 gaelcge

Hi again, I think my first issue was caused by the hdf5r version. For the second issue (same as FloWuenne's), I'm not sure exactly why, but it seems to work in an interactive session with a predetermined number of CPUs (-n 2)...

gaelcge avatar Apr 08 '19 18:04 gaelcge

Update:

We have figured out that if running dyno inside an interactive session on our HPC, we do not run into the above-mentioned issue and are able to call methods from dyno inside R from the command line!

FloWuenne avatar Apr 08 '19 19:04 FloWuenne

Hi all, I'm afraid this indeed has something to do with how the HPC is configured and/or some bugs inside singularity. For example, within our HPC:

  • Login/submit node: singularity runs perfectly
  • When ssh'ing directly to one of the nodes: singularity runs perfectly
  • When using the queuing system to submit a job: singularity errors due to threading error
  • When using the queuing system to login on a node: singularity errors due to threading error

One way to solve this is to first pull the singularity containers within a session where singularity can pull just fine. These containers are cached by default inside the .singularity/ folder, so you are free to run the container in another way as long as the singularity cache dir is specified correctly (see babelwhale::create_singularity_config() and babelwhale::set_default_config()).
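A minimal sketch of that pre-pull step as a shell function, meant to be run in a session where singularity can build images (for example the login node). It is not taken from the dyno documentation. The function name prepull_container and its arguments are made up for illustration. It reuses the singularity exec 'docker://...' echo hi command from earlier in this thread and sets SINGULARITY_CACHEDIR, assuming that your singularity version honours that variable and that babelwhale is later pointed at the same directory via create_singularity_config(cache_dir = ...).

```shell
# prepull_container IMAGE CACHE_DIR   (hypothetical helper, for illustration)
#   IMAGE      Docker Hub reference, e.g. dynverse/ti_slingshot:v0.9.9
#   CACHE_DIR  directory to use as the singularity cache (assumption: the
#              same directory is later given to babelwhale as cache_dir)
# The body runs in a subshell, so the exported variable does not leak
# into the calling shell.
prepull_container() (
  image="$1"
  cache_dir="$2"

  if [ -z "$image" ] || [ -z "$cache_dir" ]; then
    echo "usage: prepull_container IMAGE CACHE_DIR" >&2
    exit 2
  fi

  # On many clusters singularity is only available after loading a module.
  if ! command -v singularity >/dev/null 2>&1; then
    echo "singularity not found on PATH" >&2
    exit 127
  fi

  mkdir -p "$cache_dir" || exit 1
  export SINGULARITY_CACHEDIR="$cache_dir"

  # Running a trivial command forces the image to be pulled and built once.
  singularity exec "docker://$image" echo hi
)
```

For example, prepull_container dynverse/ti_slingshot:v0.9.9 "$PWD/.singularity" on the login node. A non-zero exit status means the pull failed in that session too, so the cache should not be relied on.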

zouter avatar Apr 09 '19 06:04 zouter

Thanks, @zouter for the clarification. I can confirm similar behavior on our HPC. It might be worth adding a note somewhere in the tutorial that dyno is saving singularity containers in (I am guessing) $HOME/.singularity?

We were wondering yesterday where the containers are actually stored and your thread helped us directly with this! Is there a way to change where dyno stores the singularity containers? I'm asking because our HPC has limited home space.

Thanks a lot for all the effort!

FloWuenne avatar Apr 09 '19 13:04 FloWuenne

Yes, that should be added to the tutorial for sure, thanks for the suggestion. The default behaviour is actually saving it to $pwd/.singularity, which is also default behaviour of singularity pull.

You're certainly not alone in wanting to change the location of the singularity containers on an HPC. To do this, you can either use the environment variable SINGULARITY_CACHEDIR (which was an environment variable present in singularity 2 but removed completely in singularity 3 for whatever reason), or do

config <- babelwhale::create_singularity_config(cache_dir = "...")
babelwhale::set_default_config(config)

(babelwhale is the package that we use for communication with docker and singularity; dynwrap re-exports several babelwhale functions, so you can also do dynwrap::....)

Hope this helps

zouter avatar Apr 09 '19 14:04 zouter

Hi all,

Thanks for this thread, it's been really helpful to me in trying to set up dynverse to run on our cluster. Forgive me, I'm new to using containers and am having a lot of trouble getting the singularity containers to run on the cluster. I have pulled all the TI method images to my local machine and then uploaded them to the cluster, but I can't for the life of me get singularity to recognize that directory as the cache directory to run from. When I use the babelwhale call as suggested (like below)

##Specify where to place singularity containers
singularity_config <- dynwrap::set_default_config(dynwrap::create_singularity_config(cache_dir = "/hpc/users/cohenp05/Comp1_Container/"))
babelwhale::set_default_config(singularity_config)

and point the singularity cache_dir to my directory, I get this error:

FATAL: Unable to handle docker://alpine:3.7 uri: failed to get SHA of docker://alpine:3.7: pinging docker registry returned: Get https://registry-1.docker.io/v2/: dial tcp 52.22.67.152:443: connect: network is unreachable

Here is my sessionInfo: `R version 3.5.3 (2019-03-11) Platform: x86_64-pc-linux-gnu (64-bit) Running under: CentOS Linux 7 (Core)

Matrix products: default BLAS/LAPACK: /hpc/packages/minerva-centos7/intel/parallel_studio_xe_2019/compilers_and_libraries_2019.0.117/linux/mkl/lib/intel64_lin/libmkl_gf_lp64.so

locale: [1] C

attached base packages: [1] splines stats4 parallel stats graphics grDevices utils
[8] datasets methods base

other attached packages: [1] rlist_0.4.6.1 forcats_0.4.0 stringr_1.4.0
[4] purrr_0.3.2 readr_1.3.1 tidyr_0.8.3
[7] tibble_2.1.1 tidyverse_1.2.1 dyno_0.1.1
[10] dynwrap_1.1.3 dynplot_1.0.1 dynmethods_1.0.4
[13] dynguidelines_1.0.0 dynfeature_1.0.0 RColorBrewer_1.1-2
[16] data.table_1.12.2 monocle_2.10.1 DDRTree_0.1.5
[19] irlba_2.3.3 VGAM_1.1-1 Biobase_2.42.0
[22] BiocGenerics_0.28.0 Matrix_1.2-15 gridExtra_2.3
[25] Seurat_3.0.2 dplyr_0.8.1 plyr_1.8.4
[28] cowplot_0.9.4 ggplot2_3.1.1

loaded via a namespace (and not attached): [1] reticulate_1.12 R.utils_2.8.0 tidyselect_0.2.5
[4] htmlwidgets_1.3 grid_3.5.3 combinat_0.0-8
[7] ranger_0.11.2 docopt_0.6.1 Rtsne_0.15
[10] munsell_0.5.0 codetools_0.2-16 ica_1.0-2
[13] future_1.13.0 withr_2.1.2 colorspace_1.4-1
[16] fastICA_1.2-1 knitr_1.23 rstudioapi_0.10
[19] ROCR_1.0-7 gbRd_0.4-11 listenv_0.7.0
[22] Rdpack_0.11-0 slam_0.1-45 polyclip_1.10-0
[25] farver_1.1.0 pheatmap_1.0.12 generics_0.0.2
[28] xfun_0.7 R6_2.4.0 GA_3.2
[31] rsvd_1.0.0 pdist_1.2 bitops_1.0-6
[34] assertthat_0.2.1 promises_1.0.1 SDMTools_1.1-221.1
[37] scales_1.0.0 ggraph_1.0.2 nnet_7.3-12
[40] gtable_0.3.0 babelwhale_0.0.0.9000 npsurv_0.4-0
[43] globals_0.12.4 processx_3.3.1 tidygraph_1.1.2
[46] rlang_0.3.4 lazyeval_0.2.2 acepack_1.4.1
[49] broom_0.5.2 checkmate_1.9.3 yaml_2.2.0
[52] reshape2_1.4.3 modelr_0.1.4 backports_1.1.4
[55] httpuv_1.5.1 Hmisc_4.2-0 tools_3.5.3
[58] gplots_3.0.1.1 ggridges_0.5.1 Rcpp_1.0.1
[61] base64enc_0.1-3 densityClust_0.3 ps_1.3.0
[64] rpart_4.1-15 pbapply_1.4-0 viridis_0.5.1
[67] dynparam_1.0.0 zoo_1.8-5 haven_2.1.0
[70] ggrepel_0.8.1 cluster_2.0.7-1 magrittr_1.5
[73] carrier_0.1.0 lmtest_0.9-37 RANN_2.6.1
[76] fitdistrplus_1.0-14 matrixStats_0.54.0 hms_0.4.2
[79] patchwork_0.0.1 lsei_1.2-0 mime_0.6
[82] xtable_1.8-4 readxl_1.3.1 sparsesvd_0.1-4
[85] HSMMSingleCell_1.2.0 testthat_2.1.1 compiler_3.5.3
[88] KernSmooth_2.23-15 crayon_1.3.4 rje_1.9
[91] R.oo_1.22.0 htmltools_0.3.6 proxyC_0.1.4
[94] later_0.8.0 Formula_1.2-3 RcppParallel_4.4.2
[97] lubridate_1.7.4 tweenr_1.0.1 MASS_7.3-51.1
[100] cli_1.1.0 R.methodsS3_1.7.1 gdata_2.18.0
[103] metap_1.1 igraph_1.2.4.1 pkgconfig_2.0.2
[106] foreign_0.8-71 plotly_4.9.0 xml2_1.2.0
[109] foreach_1.4.4 dynutils_1.0.3 rvest_0.3.4
[112] bibtex_0.4.2 digest_0.6.19 dyndimred_1.0.1
[115] sctransform_0.2.0 tsne_0.1-3 cellranger_1.1.0
[118] htmlTable_1.13.1 shiny_1.3.2 gtools_3.8.1
[121] nlme_3.1-137 jsonlite_1.6 viridisLite_0.3.0
[124] limma_3.38.3 pillar_1.4.0 lattice_0.20-38
[127] httr_1.4.0 survival_2.44-1.1 glue_1.3.1
[130] remotes_2.0.4 qlcMatrix_0.9.7 FNN_1.1.3
[133] png_0.1-7 iterators_1.0.10 ggforce_0.2.2
[136] stringi_1.4.3 latticeExtra_0.6-28 caTools_1.17.1.2
[139] future.apply_1.2.0 ape_5.3 `

I've also tried running the containers in a bash script by exporting my dynverse object and a dimensional reduction object to h5 files and running singularity run directly on the image like so:

ml singularity
module load R/3.5.3
SINGULARITY_CACHEDIR="/hpc/users/cohenp05/Comp1_Container/"
export SINGULARITY_CACHEDIR

singularity cache list

singularity run /hpc/users/cohenp05/Comp1_Container/ti_comp1_latest.sif --dataset=/hpc/users/cohenp05/Dyno_Objects/Monocyte_Dyno.h5 --output=/hpc/users/cohenp05/Dyno_Objects/Monocyte_Comp1.h5 --dimred=/hpc/users/cohenp05/Dyno_Objects/Monocytes_UMAP.h5

And then I get this error

Error: --dataset should contain a pathname of a .loom or .h5 file. Add a '-h' flag for help.
Execution halted

Please help! I'm out of ideas.

cohenp05 avatar Jul 25 '19 16:07 cohenp05
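
For comparison with the bash script above: the singularity call that dyno itself printed earlier in this thread passes --dataset and --output as space-separated arguments and refers to the files by their path inside the container, after bind-mounting the host directory with -B. The sketch below imitates that shape and first checks that the dataset path exists and ends in .h5 or .loom, which is what the error message asks for. The function names check_ti_input and run_ti_container are hypothetical, the --dimred argument is left out, and whether this resolves the error above is untested.

```shell
# check_ti_input PATH   (hypothetical helper)
#   Succeeds only if PATH ends in .h5 or .loom and is a readable regular file.
check_ti_input() {
  case "$1" in
    *.h5|*.loom) ;;
    *) echo "not a .h5 or .loom path: $1" >&2; return 1 ;;
  esac
  if [ ! -f "$1" ] || [ ! -r "$1" ]; then
    echo "file missing or unreadable: $1" >&2
    return 1
  fi
}

# run_ti_container IMAGE DATASET OUTPUT_NAME   (hypothetical helper)
#   Bind-mounts the directory holding DATASET at /ti, as in the command dyno
#   printed, and passes in-container paths as space-separated arguments.
#   OUTPUT_NAME is a bare file name; the output lands next to DATASET.
run_ti_container() (
  image="$1"
  dataset="$2"
  output_name="$3"

  check_ti_input "$dataset" || exit 1

  # Absolute host directory that contains the dataset.
  host_dir="$(cd "$(dirname "$dataset")" && pwd)" || exit 1

  singularity run -B "$host_dir:/ti" "$image" \
    --dataset "/ti/$(basename "$dataset")" \
    --output "/ti/$output_name"
)
```

Running check_ti_input on the dataset path from the login node and from inside a job is a quick way to tell a missing or unmounted file apart from an argument-parsing problem.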