could not find function "sample.Z.theta_n"

Open doliv071 opened this issue 1 year ago • 16 comments

Hi all,

I've tested the program on my local machine and it works after significantly shrinking my datasets. So I've moved to a compute cluster to try to get things working on a less scaled down version of my data. I got the error below.

I guess my question is: will BayesPrism work if the bigmemory package is not installed? It isn't currently available on our cluster, and I'm guessing this is where the error originates.

Run Gibbs sampling... 
Current time:  2022-08-31 09:22:58 
Estimated time to complete:  29mins 
Estimated finishing time:  2022-08-31 09:51:32 
Start run... 
R Version:  R version 4.2.0 (2022-04-22) 

snowfall 1.84-6.2 initialized (using snow 0.4-4): parallel execution on 32 CPUs.

Error in checkForRemoteErrors(val) : 
  8 nodes produced errors; first error: could not find function "sample.Z.theta_n"
> traceback()
12: stop(count, " nodes produced errors; first error: ", firstmsg)
11: checkForRemoteErrors(val)
10: staticClusterApply(cl, fun, length(x), argfun)
9: clusterApply(cl, splitList(x, length(cl)), lapply, fun, ...)
8: lapply(args, enquote)
7: do.call("fun", lapply(args, enquote))
6: docall(c, clusterApply(cl, splitList(x, length(cl)), lapply, 
       fun, ...))
5: parLapply(sfGetCluster(), x, fun, ...)
4: sfLapply(1:nrow(X), cpu.fun)
3: run.gibbs.refPhi(gibbsSampler.obj = gibbsSampler.obj, final = final, 
       compute.elbo = compute.elbo)
2: run.gibbs(gibbsSampler.ini.cs, final = FALSE)
1: run.prism(prism = myPrism, n.cores = 32)

doliv071 avatar Aug 31 '22 13:08 doliv071

Hi David,

Thank you for your interest in our work.

BayesPrism does not depend on bigmemory, so I am not sure what the cause is. If you can send me your prism object, I am happy to help troubleshoot.

Best,

Tinyi

tinyi avatar Sep 01 '22 01:09 tinyi

The object is quite large (approximately 20 GB), so it is not easily shared. I will try a smaller dataset on the cluster and, if that fails, re-install the package; it is possible there is an issue with my installation.

doliv071 avatar Sep 01 '22 15:09 doliv071

Hi David,

I ran into the same issue (after running into a different 'function not found' error). Are you running this on a Windows machine? I believe I found the solution to the error in a fork of this package in the meantime:

https://github.com/kevincjnixon/BayesPrism

kevincjnixon avatar Oct 05 '22 18:10 kevincjnixon

Hi Tinyi and team,

I am having the same issue and was wondering if you had any suggestions as to how to fix this? Many thanks, Hannah

hannahsfuchs avatar Sep 11 '23 11:09 hannahsfuchs

Hello,

I know I never followed up on this error, and I apologize for that. Since people are still running into it, I will suggest a fix, though I'm not certain it will solve the problem, as I've never worked with the snowfall package directly.

I believe the error is that the function sample.Z.theta_n is not exported to the sf cluster along with its parameters. Line 256 of run_gibbs.R currently reads:

sfExport("phi", "X", "alpha", "gibbs.idx", "seed", "compute.elbo")

If my hunch is correct, it should in fact read:

sfExport("phi", "X", "alpha", "gibbs.idx", "seed", "compute.elbo", "sample.Z.theta_n", "newJointPost")

So that the cluster will have access to these two functions.
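
A minimal, self-contained sketch of the idea, assuming the snowfall package is installed. The helper square_it and this definition of cpu.fun are hypothetical stand-ins, not BayesPrism code; the point is only that a function defined on the master may be invisible to the workers until it is exported by name.

```r
library(snowfall)

# Hypothetical helper standing in for sample.Z.theta_n.
square_it <- function(i) i^2

# Hypothetical worker function that calls the helper by name.
cpu.fun <- function(i) square_it(i)

# Start a small local socket cluster.
sfInit(parallel = TRUE, cpus = 2)

# Before the export, the workers are expected to fail because
# square_it does not exist in their global environments.
before <- tryCatch(
  sfLapply(1:4, cpu.fun),
  error = function(e) conditionMessage(e)
)

# Export the helper by name, then run the same call again.
sfExport("square_it")
after <- sfLapply(1:4, cpu.fun)

sfStop()

print(before)
print(unlist(after))
```

When run as a top-level script, before should hold an error message of the same "could not find function" kind as in the traceback, and after holds the squared values.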

Hope this helps. -Dave

doliv071 avatar Sep 11 '23 15:09 doliv071

Dear users,

Thank you for your feedback. Sorry that this fell through the cracks.

When parallelizing with snowfall, these functions (sample.Z.theta_n and newJointPost) should have been exported to the local environment, since they are exported in the namespace of the BayesPrism package. However, the behavior on Windows machines might be different. I noticed that @kevincjnixon has added these functions to the export in his fork, which seems to have solved this issue. Feel free to try his fork (https://github.com/kevincjnixon/BayesPrism) and let me know if it works.

Best,

Tinyi

tinyi avatar Sep 11 '23 16:09 tinyi

Hi Tinyi,

Just glancing at the same code chunk in @kevincjnixon's fork, I see that he has indeed corrected the export of sample.Z.theta_n:

sfExport("phi", "X", "alpha", "gibbs.idx", "seed", "compute.elbo", "sample.Z.theta_n","sample.theta_n","rdirichlet", "Rcgminu")

Edit: to clarify, I'm not using Windows; I'm on a Linux machine or Linux cluster when running the code.

doliv071 avatar Sep 11 '23 16:09 doliv071

This edit fixed the problem, many thanks all! Just as a side note, this problem occurred on a Linux machine, so it does not seem to be a Windows-only problem.

hannahsfuchs avatar Sep 12 '23 10:09 hannahsfuchs

Dear users,

Thank you so much for your feedback and for offering solutions.

I just tested this on R 4.2.3 on CentOS Linux release 7.6.1810 (Core), but was not able to reproduce this error.

I suspect that this might be due to the snowfall version. Could any of you print the details of your R environment using sessionInfo()? I am hoping to understand the exact cause before creating a merge.
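
A short sketch of how that information could be collected, using only base R calls. The list of packages to report is an assumption about what is relevant here; a package that is not installed is reported as such rather than raising an error.

```r
# Packages whose versions seem most relevant to this issue (an assumption).
pkgs <- c("BayesPrism", "snowfall", "snow")

# Look up each installed version without attaching the package.
versions <- vapply(pkgs, function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    as.character(packageVersion(pkg))
  } else {
    "not installed"
  }
}, character(1))

print(versions)

# Full session details: R version, platform, attached and loaded packages.
print(sessionInfo())
```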

Many thanks.

Best,

Tinyi

tinyi avatar Sep 13 '23 17:09 tinyi

R version 4.2.0 (2022-04-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS:   /pbtech_mounts/software/spack/centos7/opt/spack/linux-centos7-x86_64/gcc-8.2.0/r-4.2.0-vhpax6qlxl2kguvqhnp4qlmbnlzdoyfv/rlib/R/lib/libRblas.so
LAPACK: /pbtech_mounts/software/spack/centos7/opt/spack/linux-centos7-x86_64/gcc-8.2.0/r-4.2.0-vhpax6qlxl2kguvqhnp4qlmbnlzdoyfv/rlib/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] BayesPrism_2.0      NMF_0.24.0          Biobase_2.56.0     
 [4] BiocGenerics_0.42.0 cluster_2.1.4       rngtools_1.5.2     
 [7] pkgmaker_0.32.2     registry_0.5-1      snowfall_1.84-6.2  
[10] snow_0.4-4         

loaded via a namespace (and not attached):
 [1] MatrixGenerics_1.8.1        edgeR_3.38.4               
 [3] BiocSingular_1.12.0         foreach_1.5.2              
 [5] scuttle_1.6.3               DelayedMatrixStats_1.18.0  
 [7] gtools_3.9.3                assertthat_0.2.1           
 [9] statmod_1.4.37              stats4_4.2.0               
[11] dqrng_0.3.0                 GenomeInfoDbData_1.2.8     
[13] pillar_1.8.1                lattice_0.20-45            
[15] limma_3.52.2                glue_1.6.2                 
[17] beachmat_2.12.0             digest_0.6.29              
[19] GenomicRanges_1.48.0        RColorBrewer_1.1-3         
[21] XVector_0.36.0              colorspace_2.0-3           
[23] Matrix_1.5-1                plyr_1.8.7                 
[25] pkgconfig_2.0.3             zlibbioc_1.42.0            
[27] purrr_0.3.4                 xtable_1.8-4               
[29] scales_1.2.1                ScaledMatrix_1.4.0         
[31] BiocParallel_1.30.3         tibble_3.1.8               
[33] generics_0.1.3              IRanges_2.30.1             
[35] ggplot2_3.3.6               withr_2.5.0                
[37] SummarizedExperiment_1.26.1 cli_3.4.0                  
[39] magrittr_2.0.3              fansi_1.0.3                
[41] bluster_1.6.0               doParallel_1.0.17          
[43] gplots_3.1.3                tools_4.2.0                
[45] lifecycle_1.0.2             matrixStats_0.62.0         
[47] gridBase_0.4-7              stringr_1.4.1              
[49] S4Vectors_0.34.0            locfit_1.5-9.6             
[51] munsell_0.5.0               DelayedArray_0.22.0        
[53] irlba_2.3.5                 compiler_4.2.0             
[55] GenomeInfoDb_1.32.4         rsvd_1.0.5                 
[57] caTools_1.18.2              rlang_1.0.5                
[59] grid_4.2.0                  RCurl_1.98-1.8             
[61] BiocNeighbors_1.14.0        iterators_1.0.14           
[63] SingleCellExperiment_1.18.0 igraph_1.3.4               
[65] bitops_1.0-7                gtable_0.3.1               
[67] codetools_0.2-18            DBI_1.1.3                  
[69] reshape2_1.4.4              R6_2.5.1                   
[71] dplyr_1.0.9                 utf8_1.2.2                 
[73] metapod_1.4.0               KernSmooth_2.23-20         
[75] stringi_1.7.8               parallel_4.2.0             
[77] Rcpp_1.0.9                  scran_1.24.0               
[79] vctrs_0.4.1                 tidyselect_1.1.2           
[81] sparseMatrixStats_1.8.0    

doliv071 avatar Sep 13 '23 18:09 doliv071

Hi David,

Thank you very much for the information. Looking at the actual error message, I realized that 8 of the 32 cores did not load the function. Considering that you also mentioned no error was reported when the dataset was shrunk, I believe this is a memory overflow. Reducing the number of cores should avoid this problem too. Feel free to try setting n.cores to something less than 8 to see if it works.
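
A small sketch of choosing a lower core count before the call. The cap of 4 is an arbitrary illustration (any value below 8 fits the suggestion above), and the run.prism call is left commented out because it needs a real prism object such as myPrism from the traceback; bp.res is a hypothetical result name.

```r
library(parallel)

# detectCores() can return NA on some platforms, so fall back to 1.
available <- detectCores()

# Arbitrary illustrative cap, chosen to stay below 8.
cap <- 4L
n.cores <- if (is.na(available)) 1L else max(1L, min(cap, available))

print(n.cores)

# With a real prism object, the call from the traceback would become:
# bp.res <- run.prism(prism = myPrism, n.cores = n.cores)
```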

I will also discuss with our programming team whether a merge with Kevin's fork is appropriate.

Best,

Tinyi

tinyi avatar Sep 13 '23 19:09 tinyi

Hi David,

Thank you for the proposed solution to the problem. Could you please tell me how exactly to implement the replacement in the code? I use Linux and have the same problem. When installing the package, all R files are stored in Rdb format. I don't understand how you managed to change the code of a certain file. I will be glad to receive your help!

Regards, Anastasia

Miirable avatar Sep 21 '23 19:09 Miirable

Hi Anastasia,

The easiest way would be to clone this repo to your local computer and modify run_gibbs.R as I described above (not the example from the forked project, as it has other variables that will likely cause a new error). Then you can install from the local directory with install.packages("path/to/modified/repo", repos = NULL, type = "source").
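
The install step could be wrapped roughly as follows. This is a sketch: install_local_source is a hypothetical helper name, and the path is the same placeholder as above. It only checks that the directory looks like an R package (it contains a DESCRIPTION file) before installing; if the package sources sit in a subdirectory of the cloned repo, point the path at that subdirectory.

```r
# Hypothetical helper: install an R package from a local source directory,
# but only if the directory contains a DESCRIPTION file.
install_local_source <- function(pkg_dir) {
  if (!file.exists(file.path(pkg_dir, "DESCRIPTION"))) {
    message("No DESCRIPTION file found in ", pkg_dir, "; check the path.")
    return(FALSE)
  }
  install.packages(pkg_dir, repos = NULL, type = "source")
  TRUE
}

# Placeholder path; replace with the location of the modified package.
installed <- install_local_source("path/to/modified/repo")
```

After a successful install, restart R before loading the package again so that the modified code is the version in use.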

Best, Dave

doliv071 avatar Sep 21 '23 20:09 doliv071

Hi Dave,

Thank you very much! I did everything as you said. However, after replacing the code I see a new error.

[Error in sfExport("phi", "X", "alpha", "gibbs.idx", "seed", "compute.elbo",  : 
  Unknown/unfound variable sample.Z.theta_n in export. (local=TRUE)]

Could you tell me what I'm doing wrong?

Regards, Anastasia

Miirable avatar Sep 22 '23 15:09 Miirable

Hi Anastasia,

As I mentioned above, I don't have any experience working directly with the snowfall package, so I'm not sure what that error is from. @hannahsfuchs said that she got it to work; maybe she can clarify the solution that worked for her.
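
One possible reading of that error, offered as a sketch rather than a confirmed diagnosis: with its default local = TRUE, sfExport appears to look up each name as an ordinary variable in the calling frames and the global environment, so a function that is not visible there under that name may be reported as unfound. The example below assumes snowfall is installed and uses tools::file_ext purely as a stand-in for a function that lives in a package namespace; it is not BayesPrism code.

```r
library(snowfall)

sfInit(parallel = TRUE, cpus = 2)

# tools is not attached here, so the name file_ext is not visible as a
# plain variable; the export is expected to stop with an error.
unfound <- tryCatch(
  {
    sfExport("file_ext")
    "export succeeded"
  },
  error = function(e) conditionMessage(e)
)

# Binding the function to a variable in the calling environment gives
# sfExport something to find under that name.
file_ext <- tools::file_ext
sfExport("file_ext")

exts <- sfLapply(c("a.txt", "b.csv"), function(p) file_ext(p))

sfStop()

print(unfound)
print(unlist(exts))
```

If this reading is right, assigning the needed function to a local variable of the same name just before the sfExport call would be one way around the error, but that is untested against BayesPrism itself.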

Best, Dave

doliv071 avatar Sep 22 '23 16:09 doliv071

Hi all,

This is most likely caused by memory overflow. It is always a safe practice to reduce the number of threads.

I have tried to fix the bug and improved the memory usage of snowfall. BayesPrism is now upgraded to v2.1. Feel free to try it and let me know if there are any issues.

Thanks.

Best,

Tinyi

tinyi avatar Sep 23 '23 05:09 tinyi