Error in subsetArchRProject: Error in loadArchRProject(path = outputDirectory) : all(file.exists(zfiles)) is not TRUE

Open YuZhengM opened this issue 2 years ago • 6 comments

When I use the subsetArchRProject function, it fails on a file.exists check. The details are below:

Describe the bug/Browse code information

> idxSample <- BiocGenerics::which(proj$Sample == name)
> cellsSample <- proj$cellNames[idxSample]
> proj_name <- subsetArchRProject(proj, cellsSample, outputDirectory=paste0("ArchRSubset_", name))
Copying ArchRProject to new outputDirectory : /mnt/f/software/scATAC_data/handler/GSE129785/ArchRSubset_GSE129785_scATAC-PBMCs-Frozen
Copying Arrow Files...
Getting ImputeWeights
No imputeWeights found, returning NULL
Copying Other Files...
Copying Other Files (1 of 19): Embeddings
Copying Other Files (2 of 19): GroupCoverages
Copying Other Files (3 of 19): GSE129785_scATAC-Hematopoiesis-All.tsv.gz
Copying Other Files (4 of 19): GSE129785_scATAC-Hematopoiesis-All.tsv.gz.tbi
Copying Other Files (5 of 19): GSE129785_scATAC-Hematopoiesis-CD34.tsv.gz
Copying Other Files (6 of 19): GSE129785_scATAC-Hematopoiesis-CD34.tsv.gz.tbi
Copying Other Files (7 of 19): GSE129785_scATAC-PBMCs-Fresh.tsv.gz
Copying Other Files (8 of 19): GSE129785_scATAC-PBMCs-Fresh.tsv.gz.tbi
Copying Other Files (9 of 19): GSE129785_scATAC-PBMCs-Frozen.tsv.gz
Copying Other Files (10 of 19): GSE129785_scATAC-PBMCs-Frozen.tsv.gz.tbi
Copying Other Files (11 of 19): GSE129785_scATAC-PBMCs-FrozenSort.tsv.gz
Copying Other Files (12 of 19): GSE129785_scATAC-PBMCs-FrozenSort.tsv.gz.tbi
Copying Other Files (13 of 19): GSE129785_scATAC-TME-All.tsv.gz
Copying Other Files (14 of 19): GSE129785_scATAC-TME-All.tsv.gz.tbi
Copying Other Files (15 of 19): GSE129785_scATAC-TME-TCells.tsv.gz
Copying Other Files (16 of 19): GSE129785_scATAC-TME-TCells.tsv.gz.tbi
Copying Other Files (17 of 19): IterativeLSI
Copying Other Files (18 of 19): PeakCalls
Copying Other Files (19 of 19): Plots
Saving ArchRProject...
Loading ArchRProject...
Error in loadArchRProject(path = outputDirectory) :
  all(file.exists(zfiles)) is not TRUE
>

I installed the ArchR package with devtools::install_github("GreenleafLab/ArchR", ref="dev", repos = BiocManager::repositories()).

I looked at the code that raises this error and suspect the line zfiles <- gsub(outputDir, outputDirNew, zdata$File). It looks like a path concatenation problem:

  if (length(ArchRProj@projectMetadata$GroupCoverages) > 0) {
    groupC <- length(ArchRProj@projectMetadata$GroupCoverages)
    for (z in seq_len(groupC)) {
      zdata <- ArchRProj@projectMetadata$GroupCoverages[[z]]$coverageMetadata
      zfiles <- gsub(outputDir, outputDirNew, zdata$File)
      ArchRProj@projectMetadata$GroupCoverages[[z]]$coverageMetadata$File <- zfiles
      stopifnot(all(file.exists(zfiles)))
    }
  }
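The stopifnot(all(file.exists(zfiles))) check above only says that at least one file is missing, not which one. A minimal sketch of a more informative check follows. The helper name report_missing is hypothetical and not part of ArchR, and the demonstration uses temporary files rather than real coverage files.

```r
# Hypothetical helper (not part of ArchR): list the missing paths
# instead of stopping with a bare "is not TRUE" message.
report_missing <- function(paths) {
  missing <- paths[!file.exists(paths)]
  if (length(missing) > 0) {
    message("Missing files (", length(missing), " of ", length(paths), "):")
    message(paste0("  ", missing, collapse = "\n"))
  }
  invisible(missing)
}

# Demonstration: one temporary file that exists, one path that does not.
existing <- tempfile(fileext = ".h5")
file.create(existing)
absent <- file.path(tempdir(), "no_such_dir", "coverage.h5")

missing <- report_missing(c(existing, absent))
```

Running something like report_missing(zfiles) just before the stopifnot call would show the rewritten paths that do not exist on disk.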

I stepped through this function line by line and found that the values in zfiles differ from the actual paths on disk. The directory name is extended with a second "Subset_GSE129785_scATAC-PBMCs-Frozen":

> zfiles
 [1] "/mnt/f/software/scATAC_data/handler/GSE129785/ArchRSubset_GSE129785_scATAC-PBMCs-FrozenSubset_GSE129785_scATAC-PBMCs-Frozen/GroupCoverages/cell_type/Cluster0._.GSE129785_scATAC.PBMCs.Fresh.insertions.coverage.h5"
 [2] "/mnt/f/software/scATAC_data/handler/GSE129785/ArchRSubset_GSE129785_scATAC-PBMCs-FrozenSubset_GSE129785_scATAC-PBMCs-Frozen/GroupCoverages/cell_type/Cluster0._.GSE129785_scATAC.PBMCs.Frozen.insertions.coverage.h5"
 [3] "/mnt/f/software/scATAC_data/handler/GSE129785/ArchRSubset_GSE129785_scATAC-PBMCs-FrozenSubset_GSE129785_scATAC-PBMCs-Frozen/GroupCoverages/cell_type/Cluster1._.GSE129785_scATAC.Hematopoiesis.CD34.insertions.coverage.h5"
 [4] "/mnt/f/software/scATAC_data/handler/GSE129785/ArchRSubset_GSE129785_scATAC-PBMCs-FrozenSubset_GSE129785_scATAC-PBMCs-Frozen/GroupCoverages/cell_type/Cluster1._.GSE129785_scATAC.Hematopoiesis.All.insertions.coverage.h5"
 .........

The actual path of my files:

/mnt/f/software/scATAC_data/handler/GSE129785/ArchRSubset_GSE129785_scATAC-PBMCs-Frozen/GroupCoverages/cell_type/Cluster0._.GSE129785_scATAC.PBMCs.Fresh.insertions.coverage.h5
.......

I look forward to your answer. Thank you very much!

YuZhengM avatar Apr 16 '23 04:04 YuZhengM

Hi @YuZhengM! Thanks for using ArchR! Please make sure that your post belongs in the Issues section. Only bugs and error reports belong in the Issues section. Usage questions and feature requests should be posted in the Discussions section, not in Issues.
It is worth noting that there are very few actual bugs in ArchR. If you are getting an error, it is probably something specific to your dataset, usage, or computational environment, all of which are extremely challenging to troubleshoot. As such, we require reproducible examples (preferably using the tutorial dataset) from users who want assistance. If you cannot reproduce your error, how will we be able to help? Before going through the work of making a reproducible example, search the previous Issues, Discussions, function definitions, or the ArchR manual and you will likely find the answers you are looking for. If your post does not contain a reproducible example, it is unlikely to receive a response.
In addition to a reproducible example, you must respond to the following questions before we help you, unless your original post already contained this information: 1. If you've encountered an error, have you already searched previous Issues to make sure that this hasn't already been solved? 2. Can you recapitulate your error using the tutorial code and dataset? If so, provide a reproducible example. 3. Did you post your log file? If not, add it now. 4. Remove any screenshots that contain text and instead copy and paste the text using markdown's codeblock syntax (three consecutive backticks). You can do this by editing your original post.

rcorces avatar Apr 16 '23 04:04 rcorces

I also found that zdata$File already holds the actual paths of my files:

> zdata$File
 [1] "/mnt/f/software/scATAC_data/handler/GSE129785/ArchRSubset_GSE129785_scATAC-PBMCs-Frozen/GroupCoverages/cell_type/Cluster0._.GSE129785_scATAC.PBMCs.Fresh.insertions.coverage.h5"
 [2] "/mnt/f/software/scATAC_data/handler/GSE129785/ArchRSubset_GSE129785_scATAC-PBMCs-Frozen/GroupCoverages/cell_type/Cluster0._.GSE129785_scATAC.PBMCs.Frozen.insertions.coverage.h5"
 [3] "/mnt/f/software/scATAC_data/handler/GSE129785/ArchRSubset_GSE129785_scATAC-PBMCs-Frozen/GroupCoverages/cell_type/Cluster1._.GSE129785_scATAC.Hematopoiesis.CD34.insertions.coverage.h5"
 [4] "/mnt/f/software/scATAC_data/handler/GSE129785/ArchRSubset_GSE129785_scATAC-PBMCs-Frozen/GroupCoverages/cell_type/Cluster1._.GSE129785_scATAC.Hematopoiesis.All.insertions.coverage.h5"
 ......
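One possible reading of the two outputs above, offered as an assumption rather than a confirmed diagnosis: if the old output directory is a leading substring of the new one, and the stored paths already point at the new directory, an unanchored gsub matches inside the new directory name and appends its suffix a second time. The sketch below reproduces that shape with made-up paths. old_dir, new_dir and the relocate helper are hypothetical and not taken from ArchR.

```r
# Hypothetical paths for illustration only; they mimic the shape of the
# paths in this report but are not taken from it.
old_dir <- "/data/project/ArchR"
new_dir <- "/data/project/ArchRSubset_A"

# A path that already points into the new directory.
already_moved <- file.path(new_dir, "GroupCoverages", "x.coverage.h5")

# Unanchored replacement: old_dir matches the start of new_dir,
# so the suffix "Subset_A" ends up in the path twice.
doubled <- gsub(old_dir, new_dir, already_moved)
print(doubled)

# A guarded variant: rewrite only paths that sit inside old_dir
# (old_dir followed by "/") and leave every other path untouched.
relocate <- function(paths, old_dir, new_dir) {
  prefix <- paste0(old_dir, "/")
  inside <- startsWith(paths, prefix)
  paths[inside] <- file.path(new_dir, substring(paths[inside], nchar(prefix) + 1))
  paths
}

not_moved <- file.path(old_dir, "GroupCoverages", "x.coverage.h5")
relocated <- relocate(c(already_moved, not_moved), old_dir, new_dir)
print(relocated)
```

Requiring the trailing "/" after old_dir is what stops "ArchR" from matching the beginning of "ArchRSubset_A". Whether this matches the real values of outputDir and outputDirNew in this report would need to be checked in the session that produced the error.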

YuZhengM avatar Apr 16 '23 04:04 YuZhengM

Thanks for posting. I'm currently on vacation but will look into this when I return.

rcorces avatar Apr 19 '23 11:04 rcorces

Thank you for your reply. Wishing you a pleasant vacation!

YuZhengM avatar Apr 19 '23 11:04 YuZhengM

Hi, I also ran into this problem a while ago. You can use the function "ArchR::.safeSaveRDS" to save the RDS file. P.S. The function that does the saving is not written very sensibly; this check should not be there.

schnappi-wkl avatar May 16 '23 10:05 schnappi-wkl

I ran into the same error: Error in loadArchRProject(path = outputDirectory) : all(file.exists(zfiles)) is not TRUE

I figured out the problem, and it was caused by me: the GroupCoverages paths had not been updated yet. Since I use ArchR and my datasets across multiple computers, I keep the Git projects in my Box Drive. Because ArchR stores absolute paths, I have to adjust the project object after loading it on each computer. I know Ryan has repeatedly written not to mess with the directories, which is probably a good tip.

Below is my code to change all the directories in the project. I use the 'here' package so that I do not have to adjust the code for each computer. I had to adapt it for Windows, since backslashes must be doubled there. I will update the post if I run into problems further downstream.
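As a small illustration of the backslash point: in R source code, "\\" is a string holding a single backslash, while a lone "\" followed by a quote leaves the string unterminated. The path below is made up for the example.

```r
# In R source, "\\" is a string containing exactly one backslash.
stopifnot(nchar("\\") == 1)

# Made-up path for illustration.
p <- "C:/Users/me/ATAC/ArchRSubset"

# fixed = TRUE treats pattern and replacement literally,
# so every "/" becomes one backslash.
win_path <- gsub("/", "\\", p, fixed = TRUE)
cat(win_path, "\n")
```

Without fixed = TRUE the replacement string would be interpreted for backreferences, so the literal form is the simpler choice here.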

After fixing all the paths, subsetArchRProject ran smoothly.

# for Mac/Linux
proj@sampleColData@listData[["ArrowFiles"]] <- gsub(dirname(proj@sampleColData@listData[["ArrowFiles"]]),here("ATAC","ArchRSubset","ArrowFiles"),proj@sampleColData@listData[["ArrowFiles"]])

proj@projectMetadata@listData[["outputDirectory"]] <- gsub( proj@projectMetadata@listData[["outputDirectory"]],here("ATAC","ArchRSubset"),proj@projectMetadata@listData[["outputDirectory"]])

proj@imputeWeights@listData[["Weights"]]@listData[["w1"]] <- gsub(dirname(proj@imputeWeights@listData[["Weights"]]@listData[["w1"]]),here("ATAC","ArchRSubset","ArrowFiles","ImputeWeights","Impute-Weights-Rep-1"),proj@imputeWeights@listData[["Weights"]]@listData[["w1"]])

proj@imputeWeights@listData[["Weights"]]@listData[["w2"]] <- gsub(dirname(proj@imputeWeights@listData[["Weights"]]@listData[["w2"]]),here("ATAC","ArchRSubset","ArrowFiles","ImputeWeights","Impute-Weights-Rep-2"),proj@imputeWeights@listData[["Weights"]]@listData[["w2"]])

proj@projectMetadata@listData[["GroupCoverages"]]@listData[["ClusterByGroup"]]@listData[["coverageMetadata"]]@listData[["File"]] <- gsub(dirname(proj@projectMetadata@listData[["GroupCoverages"]]@listData[["ClusterByGroup"]]@listData[["coverageMetadata"]]@listData[["File"]]),here("ATAC","ArchRSubset","GroupCoverages","ClusterByGroup"),proj@projectMetadata@listData[["GroupCoverages"]]@listData[["ClusterByGroup"]]@listData[["coverageMetadata"]]@listData[["File"]])

proj@peakAnnotation@listData[["Motif"]][["Positions"]] <- gsub(dirname(proj@peakAnnotation@listData[["Motif"]][["Positions"]]),here("ATAC","ArchRSubset","Annotations"),proj@peakAnnotation@listData[["Motif"]][["Positions"]])

proj@peakAnnotation@listData[["Motif"]][["Matches"]] <- gsub(dirname(proj@peakAnnotation@listData[["Motif"]][["Matches"]]),here("ATAC","ArchRSubset","Annotations"),proj@peakAnnotation@listData[["Motif"]][["Matches"]])

proj@peakSet@metadata[["bgdPeaks"]] <- gsub(dirname(proj@peakSet@metadata[["bgdPeaks"]]),here("ATAC","ArchRSubset"),proj@peakSet@metadata[["bgdPeaks"]])

### for Windows
proj@sampleColData@listData[["ArrowFiles"]] <- gsub("/", "\\", gsub(dirname(proj@sampleColData@listData[["ArrowFiles"]]),here("ATAC","ArchRSubset","ArrowFiles"),proj@sampleColData@listData[["ArrowFiles"]]), fixed = TRUE)

proj@projectMetadata@listData[["outputDirectory"]] <- gsub("/", "\\", gsub(proj@projectMetadata@listData[["outputDirectory"]],here("ATAC","ArchRSubset"),proj@projectMetadata@listData[["outputDirectory"]]), fixed = TRUE)

proj@imputeWeights@listData[["Weights"]]@listData[["w1"]] <- gsub("/", "\\", gsub(dirname(proj@imputeWeights@listData[["Weights"]]@listData[["w1"]]),here("ATAC","ArchRSubset","ArrowFiles","ImputeWeights","Impute-Weights-Rep-1"),proj@imputeWeights@listData[["Weights"]]@listData[["w1"]]), fixed = TRUE)

proj@imputeWeights@listData[["Weights"]]@listData[["w2"]] <- gsub("/", "\\", gsub(dirname(proj@imputeWeights@listData[["Weights"]]@listData[["w2"]]),here("ATAC","ArchRSubset","ArrowFiles","ImputeWeights","Impute-Weights-Rep-2"),proj@imputeWeights@listData[["Weights"]]@listData[["w2"]]), fixed = TRUE)

proj@projectMetadata@listData[["GroupCoverages"]]@listData[["ClusterByGroup"]]@listData[["coverageMetadata"]]@listData[["File"]] <- gsub("/", "\\", gsub(dirname(proj@projectMetadata@listData[["GroupCoverages"]]@listData[["ClusterByGroup"]]@listData[["coverageMetadata"]]@listData[["File"]]),here("ATAC","ArchRSubset","GroupCoverages","ClusterByGroup"),proj@projectMetadata@listData[["GroupCoverages"]]@listData[["ClusterByGroup"]]@listData[["coverageMetadata"]]@listData[["File"]]), fixed = TRUE)

proj@peakAnnotation@listData[["Motif"]][["Positions"]] <- gsub("/", "\\", gsub(dirname(proj@peakAnnotation@listData[["Motif"]][["Positions"]]),here("ATAC","ArchRSubset","Annotations"),proj@peakAnnotation@listData[["Motif"]][["Positions"]]), fixed = TRUE)

proj@peakAnnotation@listData[["Motif"]][["Matches"]] <- gsub("/", "\\", gsub(dirname(proj@peakAnnotation@listData[["Motif"]][["Matches"]]),here("ATAC","ArchRSubset","Annotations"),proj@peakAnnotation@listData[["Motif"]][["Matches"]]), fixed = TRUE)

proj@peakSet@metadata[["bgdPeaks"]] <- gsub("/", "\\", gsub(dirname(proj@peakSet@metadata[["bgdPeaks"]]),here("ATAC","ArchRSubset"),proj@peakSet@metadata[["bgdPeaks"]]), fixed = TRUE)
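The repeated gsub(dirname(x), new_dir, x) pattern above can be sketched more compactly. Two properties of gsub are worth keeping in mind: it treats the pattern as a regular expression unless fixed = TRUE is set, and when the pattern is a vector it uses only the first element. The helper below, whose name move_to_dir is hypothetical and not part of ArchR or here, avoids both by keeping each file name and swapping its directory. It assumes all files of one slot belong in the same target directory, and the paths are made up.

```r
# Hypothetical helper (not part of ArchR or here): keep each file name,
# swap its directory for new_dir. Vectorised, no regular expressions.
move_to_dir <- function(paths, new_dir) {
  file.path(new_dir, basename(paths))
}

# Made-up example paths.
old_files <- c(
  "/old/machine/ATAC/ArchRSubset/ArrowFiles/sampleA.arrow",
  "/old/machine/ATAC/ArchRSubset/ArrowFiles/sampleB.arrow"
)
new_files <- move_to_dir(old_files, "/new/machine/ATAC/ArchRSubset/ArrowFiles")
print(new_files)
```

With the here package, the second argument could be here("ATAC","ArchRSubset","ArrowFiles"), as in the code above.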

eliascrapa avatar May 27 '23 07:05 eliascrapa