ArchR
How can I fix an error in the creation of Arrow Files from my fragment files?
ArchR log file
ArchR-createArrows-73c43a8d3823-Date-2024-05-20_Time-18-57-29.805596.log
Description of the bug
While creating Arrow Files from fragment_file_name.tsv.gz, ArchR logs an error in the ggplot step for the Fragment Size Distribution, continues, and then exits because too few cells pass the filters. The log reads:
2024-05-20 19:16:46.377978 : (D865 : 1 of 2) Successful creation of Arrow File, 19.246 mins elapsed.
2024-05-20 19:16:47.42894 : (D865 : 1 of 2) Adding Fragment Summary, 19.267 mins elapsed.
2024-05-20 19:17:08.62645 : (D865 : 1 of 2) Plotting Fragment Size Distribution, 19.621 mins elapsed.
2024-05-20 19:17:10.105093 : Continuing through after error ggplot for Fragment Size Distribution, 19.645 mins elapsed.
2024-05-20 19:17:11.227721 : (D865 : 1 of 2) Computing TSS Enrichment Scores, 19.664 mins elapsed.
2024-05-20 19:18:25.869288 : (D865 : 1 of 2) Computed TSS Scores!, 1.244 mins elapsed.
2024-05-20 19:18:25.885971 : Detected 2 or less cells pass filter (Non-Zero median TSS = 0.94, median Frags = 39590) in file! Check inputs such as 'filterFrags' or 'filterTSS' to keep cells! Exiting!
2024-05-20 19:18:25.893817 : createArrowFiles has encountered an error, checking if any ArrowFiles completed..
------- Completed
End Time : 2024-05-20 19:18:26.010781
Elapsed Time Minutes = 20.9341327190399
Elapsed Time Hours = 0.348902928100692
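The log reports a median TSS enrichment of 0.94 against a `minTSS` of 2, so it may be worth checking the fragment file itself before changing the filters. The base-R sketch below is one possible check, not part of ArchR: it assumes a five-column fragment file (chromosome, start, end, barcode, count), and it writes a tiny stand-in file so that it runs on its own. The variable names, the stand-in data, and the threshold of 2 fragments are all illustrative. It lists the chromosome names (the mm10 annotation uses UCSC-style names such as `chr1`) and counts fragments per barcode.

```r
# Build a tiny stand-in fragment file so the sketch runs on its own;
# point `frag_path` at the real fragment file to inspect actual data.
frag_path <- tempfile(fileext = ".tsv.gz")
con <- gzfile(frag_path, "w")
writeLines(c(
  "chr1\t100\t250\tAAAC-1\t1",
  "chr1\t300\t480\tAAAC-1\t2",
  "chr2\t120\t310\tAAAC-1\t1",
  "chr2\t500\t640\tTTTG-1\t1"
), con)
close(con)

# Read at most the first million rows; lines starting with "#" are skipped.
frags <- read.table(
  gzfile(frag_path), sep = "\t", header = FALSE,
  nrows = 1e6, comment.char = "#", stringsAsFactors = FALSE,
  col.names = c("chr", "start", "end", "barcode", "count")
)

# Chromosome names present in the file, and whether all carry a "chr" prefix.
chr_names <- sort(unique(frags$chr))
has_chr_prefix <- all(grepl("^chr", chr_names))

# Number of fragment rows per barcode, and how many barcodes reach an
# illustrative threshold of 2 fragments.
frags_per_barcode <- table(frags$barcode)
n_passing <- sum(frags_per_barcode >= 2)

print(chr_names)
print(has_chr_prefix)
print(frags_per_barcode)
```

If the chromosome names in the real file do not match the genome set with addArchRGenome(), a very low TSS enrichment would be expected, although other causes are possible.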
Although the log reports "Successful creation of Arrow File", I cannot find any Arrow files in my home directory. The output consists of three folders:
- ArchRLogs
- QualityControl, which contains SampleNames, which in turn contains Fragment Size Distribution.pdf
- tmp, which is empty
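To confirm that no Arrow file was written anywhere under the output directory, a recursive search for the `.arrow` extension can help. The following is a minimal base-R sketch; the demo directory and the file name `D865.arrow` are hypothetical stand-ins (the name is only borrowed from the sample label in the log), created so that the example runs on its own. On real output, `out_dir` would be the working directory used for createArrowFiles().

```r
# Create a throwaway directory that mimics the output layout described above.
out_dir <- tempfile("archr_demo_")
dir.create(file.path(out_dir, "ArchRLogs"), recursive = TRUE)
dir.create(file.path(out_dir, "tmp"))
file.create(file.path(out_dir, "D865.arrow"))  # hypothetical Arrow file

# Search every subfolder for files ending in ".arrow".
arrow_files <- list.files(
  out_dir, pattern = "\\.arrow$",
  recursive = TRUE, full.names = TRUE
)

print(basename(arrow_files))
```

An empty result on the real directory would mean that no Arrow file is present there or in any subfolder.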
Code: To Reproduce
Code I ran in RStudio:
library(ArchR)
fragmentFilePath <- '~/fragment_file_name.tsv.gz'
inputFiles <- c(fragmentFile = fragmentFilePath)
inputFiles
addArchRGenome("mm10")
work_dir <- "~/"
setwd(work_dir)
addArchRThreads(threads = 16)
ArrowFiles <- createArrowFiles(
  inputFiles = inputFiles,
  sampleNames = names(inputFiles),
  minTSS = 2,
  minFrags = 0,
  maxFrags = 1e+07,
  addTileMat = TRUE,
  addGeneScoreMat = TRUE,
  offsetPlus = 0,
  offsetMinus = 0,
  force = TRUE, # overwrite the Arrow file if one already exists
  TileMatParams = list(tileSize = 5000)
)
ArrowFiles
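Because the script relies on `~` both for the fragment file and for the working directory, it can be useful to see what `~` resolves to. On Windows, R often expands `~` to the user's Documents folder rather than to the user profile folder, so the "home directory" being inspected may not be the one R writes to. The base-R sketch below is only a check under that assumption; `resolved_home`, `frag_path`, and `frag_found` are illustrative names.

```r
# "~" is expanded by R itself; print the absolute path it resolves to.
resolved_home <- normalizePath(path.expand("~"), winslash = "/", mustWork = FALSE)

# Build the full path to the fragment file and check that R can see it.
frag_path <- file.path(resolved_home, "fragment_file_name.tsv.gz")
frag_found <- file.exists(frag_path)

message("Home resolves to: ", resolved_home)
message("Fragment file found: ", frag_found)
```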
Expected behavior
Creation of Arrow File: fragment_file_name.arrow, in the ArchR directory.
ArchR Tutorial Code Link: https://www.archrproject.com/bookdown/creating-arrow-files.html
library(ArchR)
inputFiles <- getTutorialData("Hematopoiesis")
inputFiles
scATAC_BMMC_R1 "HemeFragments/scATAC_BMMC_R1.fragments.tsv.gz"
scATAC_CD34_BMMC_R1 "HemeFragments/scATAC_CD34_BMMC_R1.fragments.tsv.gz"
scATAC_PBMC_R1 "HemeFragments/scATAC_PBMC_R1.fragments.tsv.gz"
addArchRGenome("hg19")
addArchRThreads(threads = 16)
Setting default genome to Hg19.
Setting default number of Parallel threads to 16.
ArrowFiles <- createArrowFiles(
  inputFiles = inputFiles,
  sampleNames = names(inputFiles),
  filterTSS = 4, # Don't set this too high because you can always increase later
  filterFrags = 1000,
  addTileMat = TRUE,
  addGeneScoreMat = TRUE
)
Using GeneAnnotation set by addArchRGenome(Hg19)!
Using GeneAnnotation set by addArchRGenome(Hg19)!
ArchR logging to : ArchRLogs/ArchR-createArrows-dfa159ddbf6e-Date-2020-04-15_Time-09-21-27.log
If there is an issue, please report to github with logFile!
Cleaning Temporary Files
2020-04-15 09:21:28 : Batch Execution w/ safelapply!, 0 mins elapsed.
ArchR logging successful to : ArchRLogs/ArchR-createArrows-dfa159ddbf6e-Date-2020-04-15_Time-09-21-27.log
ArrowFiles
"scATAC_BMMC_R1.arrow" "scATAC_CD34_BMMC_R1.arrow" "scATAC_PBMC_R1.arrow"
Additional context
Windows specifications of my device:
- Edition: Windows 10 Home
- Version: 22H2
- Installed on: 1/22/2021
- OS build: 19045.4412
- Experience: Windows Feature Experience Pack 1000.19056.1000.0
- R version: 4.3.3