
How can I fix an error in the creation of Arrow Files from my fragment files?

Open Axbxh opened this issue 9 months ago • 3 comments

ArchR log file

ArchR-createArrows-73c43a8d3823-Date-2024-05-20_Time-18-57-29.805596.log

Description of the bug

While creating Arrow files from fragment_file_name.tsv.gz, I get an error at the ggplot step for the Fragment Size Distribution plot. The log reads:

2024-05-20 19:16:46.377978 : (D865 : 1 of 2) Successful creation of Arrow File, 19.246 mins elapsed.
2024-05-20 19:16:47.42894 : (D865 : 1 of 2) Adding Fragment Summary, 19.267 mins elapsed.
2024-05-20 19:17:08.62645 : (D865 : 1 of 2) Plotting Fragment Size Distribution, 19.621 mins elapsed.
2024-05-20 19:17:10.105093 : Continuing through after error ggplot for Fragment Size Distribution, 19.645 mins elapsed.
2024-05-20 19:17:11.227721 : (D865 : 1 of 2) Computing TSS Enrichment Scores, 19.664 mins elapsed.
2024-05-20 19:18:25.869288 : (D865 : 1 of 2) Computed TSS Scores!, 1.244 mins elapsed.

2024-05-20 19:18:25.885971 : Detected 2 or less cells pass filter (Non-Zero median TSS = 0.94, median Frags = 39590) in file! Check inputs such as 'filterFrags' or 'filterTSS' to keep cells! Exiting!

2024-05-20 19:18:25.893817 : createArrowFiles has encountered an error, checking if any ArrowFiles completed..

------- Completed

End Time : 2024-05-20 19:18:26.010781
Elapsed Time Minutes = 20.9341327190399
Elapsed Time Hours = 0.348902928100692
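The final message suggests that almost no barcodes survived quality filtering: the reported non-zero median TSS enrichment (0.94) is below the `minTSS = 2` cutoff passed to `createArrowFiles` in the reproduction code. The sketch below is a toy illustration of that kind of threshold filter in base R. The per-cell values are invented, `passes_filter` is a hypothetical helper, and the assumption that a cell is kept only when all three thresholds hold is an assumption, not something taken from the ArchR source.

```r
# Toy illustration only: these per-cell values are invented, not taken from the log.
cells <- data.frame(
  barcode = c("cell_a", "cell_b", "cell_c", "cell_d"),
  tss     = c(0.8, 0.94, 1.1, 2.5),        # TSS enrichment per barcode
  nFrags  = c(42000, 39590, 37000, 51000)  # unique fragments per barcode
)

# Hypothetical helper: keep a cell only if it clears every threshold.
passes_filter <- function(tss, nFrags, minTSS, minFrags, maxFrags) {
  tss >= minTSS & nFrags >= minFrags & nFrags <= maxFrags
}

keep <- passes_filter(cells$tss, cells$nFrags,
                      minTSS = 2, minFrags = 0, maxFrags = 1e7)

sum(keep)          # number of barcodes that would be retained
median(cells$tss)  # a median below minTSS means most barcodes fail the TSS cutoff
```

With a fragment-count range this wide, only the TSS cutoff removes anything in the toy data, which is the pattern the log message would be consistent with.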

Although the log reports "Successful creation of Arrow File", I cannot find any Arrow files in my home directory. The output consists of three folders:

  1. ArchRLogs
  2. QualityControl > SampleNames > Fragment Size Distribution.pdf
  3. tmp which is empty
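To confirm whether any Arrow file was actually written, the directory can be searched for files ending in `.arrow`. The following is a minimal base R sketch; `find_arrow_files` is a hypothetical helper, and it assumes that Arrow files, if present, sit directly in the working directory that was set before calling `createArrowFiles`.

```r
# Hypothetical helper: list *.arrow files in a directory.
# Set recursive = TRUE to also look inside subfolders (can be slow on a home directory).
find_arrow_files <- function(dir = ".", recursive = FALSE) {
  list.files(path = dir, pattern = "\\.arrow$",
             recursive = recursive, full.names = TRUE)
}

arrow_files <- find_arrow_files(".")
if (length(arrow_files) == 0) {
  message("No .arrow files found in ", normalizePath("."))
} else {
  print(arrow_files)
}
```

An empty result here would match the observation above that only the log, QC, and tmp folders were produced.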

Code: To Reproduce

Code I ran in RStudio:

library(ArchR)

fragmentFilePath <- '~/fragment_file_name.tsv.gz'

inputFiles <- c(fragmentFile = fragmentFilePath)
inputFiles

addArchRGenome("mm10")

work_dir <- "~/"
setwd(work_dir)

addArchRThreads(threads = 16) 

ArrowFiles <- createArrowFiles(
  inputFiles = inputFiles,
  sampleNames = names(inputFiles),
  minTSS = 2,
  minFrags = 0,
  maxFrags = 1e+07,
  addTileMat = TRUE,
  addGeneScoreMat = TRUE,
  offsetPlus = 0,
  offsetMinus = 0,
  force = TRUE, # overwrite an existing Arrow file with the same name
  TileMatParams = list(tileSize = 5000)
)

ArrowFiles
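One thing that may be worth ruling out, given a median TSS enrichment below 1, is a mismatch between the fragment file and the mm10 annotation, for example chromosome names that lack the UCSC-style `chr` prefix. This is only a guess at a possible cause. The sketch below tabulates the chromosome names in the first records of a fragment file using base R; `peek_fragment_chroms` is a hypothetical helper, and the demo file is made up so that the block runs on its own.

```r
# Hypothetical helper: tabulate chromosome names in the first `n` fragment records.
# Assumes a tab-separated file whose first column is the chromosome name;
# lines starting with "#" are treated as comments and skipped.
peek_fragment_chroms <- function(path, n = 10000) {
  con <- gzfile(path, open = "rt")
  on.exit(close(con))
  frags <- read.table(con, sep = "\t", nrows = n, comment.char = "#",
                      stringsAsFactors = FALSE)
  sort(table(frags[[1]]), decreasing = TRUE)
}

# Demo on a tiny made-up file; point `demo_path` at the real fragment file instead.
demo_path <- tempfile(fileext = ".tsv.gz")
out <- gzfile(demo_path, open = "wt")
writeLines(c("chr1\t100\t250\tAAAC-1\t1",
             "chr1\t300\t480\tAAAG-1\t2",
             "chrX\t50\t210\tAAAC-1\t1"), out)
close(out)

chrom_counts <- peek_fragment_chroms(demo_path)
chrom_counts

# Fraction of inspected records whose chromosome name starts with "chr"
chr_fraction <- sum(chrom_counts[grepl("^chr", names(chrom_counts))]) / sum(chrom_counts)
chr_fraction
```

A fraction well below 1 on the real file would suggest that the chromosome naming differs from what mm10 expects; a fraction near 1 would point the search elsewhere.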

Expected behavior

Creation of Arrow File: fragment_file_name.arrow, in the ArchR directory.

ArchR Tutorial Code Link: https://www.archrproject.com/bookdown/creating-arrow-files.html

library(ArchR)

inputFiles <- getTutorialData("Hematopoiesis")
inputFiles

scATAC_BMMC_R1
"HemeFragments/scATAC_BMMC_R1.fragments.tsv.gz"
scATAC_CD34_BMMC_R1
"HemeFragments/scATAC_CD34_BMMC_R1.fragments.tsv.gz"
scATAC_PBMC_R1
"HemeFragments/scATAC_PBMC_R1.fragments.tsv.gz"

addArchRGenome("hg19")
addArchRThreads(threads = 16) 

Setting default genome to Hg19.
Setting default number of Parallel threads to 16.

ArrowFiles <- createArrowFiles(
  inputFiles = inputFiles,
  sampleNames = names(inputFiles),
  filterTSS = 4, # Don't set this too high because you can always increase it later
  filterFrags = 1000, 
  addTileMat = TRUE,
  addGeneScoreMat = TRUE
)

Using GeneAnnotation set by addArchRGenome(Hg19)!
Using GeneAnnotation set by addArchRGenome(Hg19)!
ArchR logging to : ArchRLogs/ArchR-createArrows-dfa159ddbf6e-Date-2020-04-15_Time-09-21-27.log
If there is an issue, please report to github with logFile!
Cleaning Temporary Files
2020-04-15 09:21:28 : Batch Execution w/ safelapply!, 0 mins elapsed.
ArchR logging successful to : ArchRLogs/ArchR-createArrows-dfa159ddbf6e-Date-2020-04-15_Time-09-21-27.log

ArrowFiles

"scATAC_BMMC_R1.arrow" "scATAC_CD34_BMMC_R1.arrow" "scATAC_PBMC_R1.arrow"

Additional context

Windows specifications of my device:

Edition: Windows 10 Home
Version: 22H2
Installed on: 1/22/2021
OS build: 19045.4412
Experience: Windows Feature Experience Pack 1000.19056.1000.0
R version: 4.3.3

Axbxh, May 23 '24 22:05