
GroupCoverage file path doesn't update saveArchRProject

Open evaham1 opened this issue 11 months ago • 5 comments

Hello and thanks for the great tool,

I would like to re-open closed bug issue #529: my GroupCoverage file paths also fail to update. I installed ArchR 1.0.3 from the dev branch using `devtools::install_github('GreenleafLab/ArchR', ref='dev', repos = BiocManager::repositories())`, but I still get the same error when I run `saveArchRProject` once I no longer have access to the old data directory.

Error message:

Error in saveArchRProject(ArchRProj = ArchR, outputDirectory = paste0(rds_path,  : 
  all(file.exists(zfiles)) is not TRUE

From inspecting ArchR@projectMetadata$GroupCoverages[[1]]$coverageMetadata$File when using ArchR_1.0.2 and ArchR_1.0.3 I can see that the coverage paths are still the same, although in #529 it looks like the paths should be updated on dev. Am I missing a step somewhere to update the group coverage paths?

Also, I would like to reiterate the point made there: it would be amazing to have this fix incorporated into a stable release, not just for sharing data but also for running ArchR in reproducible pipelines.

Thanks for your help!

sessionInfo()

R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.5 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3

Random number generation:
 RNG:     L'Ecuyer-CMRG 
 Normal:  Inversion 
 Sample:  Rejection 
 
locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
 [1] parallel  stats4    grid      stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] SeuratObject_4.1.3          Seurat_4.3.0                presto_1.0.0                pheatmap_1.0.12            
 [5] hexbin_1.28.2               GenomicFeatures_1.46.5      AnnotationDbi_1.56.2        forcats_0.5.1              
 [9] dplyr_1.1.2                 purrr_1.0.1                 readr_2.1.4                 tidyr_1.3.0                
[13] tibble_3.2.1                tidyverse_1.3.1             rhdf5_2.38.1                SummarizedExperiment_1.24.0
[17] Biobase_2.54.0              MatrixGenerics_1.6.0        Rcpp_1.0.10                 Matrix_1.5-4               
[21] GenomicRanges_1.46.1        GenomeInfoDb_1.30.1         IRanges_2.28.0              S4Vectors_0.32.4           
[25] BiocGenerics_0.40.0         matrixStats_1.0.0           data.table_1.14.8           stringr_1.5.0              
[29] plyr_1.8.8                  magrittr_2.0.3              ggplot2_3.4.2               gtable_0.3.3               
[33] gtools_3.9.4                gridExtra_2.3               ArchR_1.0.3                 getopt_1.20.3              
[37] BiocManager_1.30.20        

loaded via a namespace (and not attached):
  [1] utf8_1.2.3               spatstat.explore_3.2-1   reticulate_1.30          tidyselect_1.2.0         RSQLite_2.3.1           
  [6] htmlwidgets_1.6.2        BiocParallel_1.28.3      Rtsne_0.16               devtools_2.4.5           munsell_0.5.0           
 [11] codetools_0.2-18         ica_1.0-3                future_1.32.0            miniUI_0.1.1.1           withr_2.5.0             
 [16] spatstat.random_3.1-5    colorspace_2.1-0         progressr_0.13.0         filelock_1.0.2           rstudioapi_0.14         
 [21] ROCR_1.0-11              tensor_1.5               listenv_0.9.0            GenomeInfoDbData_1.2.7   polyclip_1.10-4         
 [26] bit64_4.0.5              rprojroot_2.0.3          parallelly_1.36.0        vctrs_0.6.3              generics_0.1.3          
 [31] BiocFileCache_2.2.1      R6_2.5.1                 bitops_1.0-7             rhdf5filters_1.6.0       spatstat.utils_3.0-3    
 [36] cachem_1.0.8             DelayedArray_0.20.0      assertthat_0.2.1         promises_1.2.0.1         BiocIO_1.4.0            
 [41] scales_1.2.1             Cairo_1.6-0              globals_0.16.2           processx_3.8.1           goftest_1.2-3           
 [46] rlang_1.1.1              splines_4.1.2            rtracklayer_1.54.0       lazyeval_0.2.2           spatstat.geom_3.2-1     
 [51] broom_0.7.12             yaml_2.3.7               reshape2_1.4.4           abind_1.4-5              modelr_0.1.8            
 [56] backports_1.4.1          httpuv_1.6.11            usethis_2.2.0            tools_4.1.2              ellipsis_0.3.2          
 [61] RColorBrewer_1.1-3       sessioninfo_1.2.2        ggridges_0.5.4           progress_1.2.2           zlibbioc_1.40.0         
 [66] RCurl_1.98-1.12          ps_1.7.5                 prettyunits_1.1.1        deldir_1.0-9             pbapply_1.7-0           
 [71] urlchecker_1.0.1         cowplot_1.1.1            zoo_1.8-12               haven_2.4.3              ggrepel_0.9.3           
 [76] cluster_2.1.2            fs_1.6.2                 scattermore_1.2          lmtest_0.9-40            reprex_2.0.1            
 [81] RANN_2.6.1               fitdistrplus_1.1-11      pkgload_1.3.2            hms_1.1.3                patchwork_1.1.2         
 [86] mime_0.12                xtable_1.8-4             XML_3.99-0.14            readxl_1.3.1             compiler_4.1.2          
 [91] biomaRt_2.50.3           KernSmooth_2.23-20       crayon_1.5.2             htmltools_0.5.5          later_1.3.1             
 [96] tzdb_0.4.0               lubridate_1.8.0          DBI_1.1.3                dbplyr_2.1.1             MASS_7.3-54             
[101] rappdirs_0.3.3           cli_3.6.1                igraph_1.5.0             pkgconfig_2.0.3          GenomicAlignments_1.30.0
[106] sp_1.6-1                 plotly_4.10.2            spatstat.sparse_3.0-1    xml2_1.3.4               XVector_0.34.0          
[111] rvest_1.0.2              callr_3.7.3              digest_0.6.31            sctransform_0.3.5        RcppAnnoy_0.0.20        
[116] spatstat.data_3.0-1      Biostrings_2.62.0        cellranger_1.1.0         leiden_0.4.3             uwot_0.1.14             
[121] restfulr_0.0.15          curl_5.0.1               shiny_1.7.4              Rsamtools_2.10.0         rjson_0.2.21            
[126] nlme_3.1-153             lifecycle_1.0.3          jsonlite_1.8.5           Rhdf5lib_1.16.0          desc_1.4.2              
[131] viridisLite_0.4.2        fansi_1.0.4              pillar_1.9.0             lattice_0.20-45          pkgbuild_1.4.1          
[136] KEGGREST_1.34.0          fastmap_1.1.1            httr_1.4.6               survival_3.2-13          remotes_2.4.2           
[141] glue_1.6.2               png_0.1-8                bit_4.0.5                profvis_0.3.8            stringi_1.7.12          
[146] blob_1.2.4               memoise_2.0.1            irlba_2.3.5.1            future.apply_1.11.0  

evaham1 avatar Jul 08 '23 17:07 evaham1

Hi @evaham1! Thanks for using ArchR! Please make sure that your post belongs in the Issues section. Only bugs and error reports belong in the Issues section. Usage questions and feature requests should be posted in the Discussions section, not in Issues.
It is worth noting that there are very few actual bugs in ArchR. If you are getting an error, it is probably something specific to your dataset, usage, or computational environment, all of which are extremely challenging to troubleshoot. As such, we require reproducible examples (preferably using the tutorial dataset) from users who want assistance. If you cannot reproduce your error, we will not be able to help. Before going through the work of making a reproducible example, search the previous Issues, Discussions, function definitions, or the ArchR manual and you will likely find the answers you are looking for. If your post does not contain a reproducible example, it is unlikely to receive a response.
In addition to a reproducible example, you must do the following things before we help you, unless your original post already contained this information:
1. If you've encountered an error, have you already searched previous Issues to make sure that this hasn't already been solved?
2. Did you post your log file? If not, add it now.
3. Remove any screenshots that contain text and instead copy and paste the text using markdown's codeblock syntax (three consecutive backticks). You can do this by editing your original post.

rcorces avatar Jul 08 '23 17:07 rcorces

I understand that some people are still having this issue but I just do not have time to look into this right now. If you create a reproducible example using the tutorial data, I will be much more likely to look back into this.

rcorces avatar Jul 14 '23 15:07 rcorces

Hello @evaham1,

I am getting the same error, and I installed ArchR 1.0.3 from the dev branch as well.

Did you find a fix for your data?

Best Wishes, C

cedmo001 avatar Oct 19 '23 14:10 cedmo001

The error is related to the paths to the GroupCoverage files. Unlike the paths for the ArchRProject itself, it seems that saveArchRProject does not always update the paths to the GroupCoverage files, meaning that if you copy an ArchRProject to a new directory, it can still hold the old directory path.

As long as those paths point to an actual file, it won't throw an error - but if you share that ArchRProject with someone else on a different PC, then suddenly those paths don't point anywhere, leading to an error.
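To make that concrete, here is a minimal base-R sketch of the kind of check the error message reports (`all(file.exists(zfiles))`). The paths are hypothetical stand-ins, and the sketch assumes the second one does not exist on the machine running it; it is not ArchR's actual code.

```r
# Stand-ins for the stored coverage paths. One file really exists,
# the other is a hypothetical path left over from another machine.
present <- tempfile(fileext = ".h5")
file.create(present)
zfiles <- c(present, "/path/on/another/machine/C1.coverage.h5")

# A single missing file is enough to make the whole check fail.
missing <- zfiles[!file.exists(zfiles)]
all_present <- all(file.exists(zfiles))

print(missing)
print(all_present)
```

Listing `missing` rather than only testing `all_present` shows which stored paths are stale.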

It looks like you guys tried to fix that error at some point by gsub-ing out the old directory, but if a project was moved and saved before that code was added, and then moved a second time, the original directory won't be gsubbed out.
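A small base-R illustration of the two-move scenario, assuming the fix substitutes the project's previous output directory with the new one. All directory and file names are hypothetical, and this is a sketch of the failure mode, not the code in ArchR.

```r
# Path recorded when the coverage was first created in /dirA/proj.
stored <- "/dirA/proj/GroupCoverages/Clusters/C1.coverage.h5"

# Move 1 (/dirA/proj -> /dirB/proj), saved before the fix existed:
# 'stored' is left untouched and still points into /dirA/proj.

# Move 2 (/dirB/proj -> /dirC/proj), saved with the fix: the previous
# directory (/dirB/proj) is substituted with the new one (/dirC/proj).
fixed <- gsub("/dirB/proj", "/dirC/proj", stored, fixed = TRUE)

# The pattern never matches, so the stale /dirA path survives unchanged.
print(identical(fixed, stored))
```

Because the substitution only knows about the most recent directory, any path that missed an earlier update can never be repaired this way.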

The solution is that saveArchRProject needs to update the GroupCoverage metadata with updated paths to the coverage files, and maybe remove metadata for coverage files that don't exist?
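One way this could work is to rebuild each path from its file name instead of substituting directories, and to flag entries whose file is absent. The sketch below uses base R only; `relocate_paths` is a hypothetical helper (not an ArchR function), and the directory layout and file names are assumptions for the demo.

```r
# Rebuild stored paths so they point into 'new_dir', keeping only the
# file name from each old path, and record whether each file exists.
relocate_paths <- function(old_paths, new_dir) {
  new_paths <- file.path(new_dir, basename(old_paths))
  data.frame(
    old = old_paths,
    new = new_paths,
    exists = file.exists(new_paths),
    stringsAsFactors = FALSE
  )
}

# Demo: a temporary directory stands in for the moved project, and only
# one of the two recorded coverage files is actually present there.
new_dir <- file.path(tempdir(), "GroupCoverages", "Clusters")
dir.create(new_dir, recursive = TRUE, showWarnings = FALSE)
file.create(file.path(new_dir, "C1.coverage.h5"))

old_paths <- c("/old/location/GroupCoverages/Clusters/C1.coverage.h5",
               "/old/location/GroupCoverages/Clusters/C2.coverage.h5")
res <- relocate_paths(old_paths, new_dir)
print(res[, c("new", "exists")])
```

With a real project, the input would presumably be the vector at `ArchRProj@projectMetadata$GroupCoverages[[1]]$coverageMetadata$File`; whether writing the rebuilt paths back into that slot is safe is untested here, so work on a copy of the project.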

Or is there a way we could have a 'rebootArchRProject' or 'dietArchRProject' - which would essentially clean up an ArchR Project of everything but the single cell metadata and TileMatrix? This could fix issues like this without needing to completely recreate the project from scratch.

Specifically, these paths are checked at line 592 of R/AllClasses.R, where the ArchRProject is saved.

https://github.com/GreenleafLab/ArchR/blob/c61b0645d1482f80dcc24e25fbd915128c1b2500/R/AllClasses.R#L592-L593

The file path is stored here: ArchRProj@projectMetadata$GroupCoverages[[1]]$coverageMetadata$File

I re-ran group coverage, and then manually removed all the earlier GroupCoverage info, both the path info within the ArchRProject object and the actual files within the ArchR directory.

Here's to hoping that I didn't mess up something else by doing that.

markphillippebworth avatar Jan 10 '24 02:01 markphillippebworth