Calling peaks with MACS2 segmentation fault error

Open allwright1 opened this issue 1 year ago • 2 comments

I've been working through the hematopoiesis tutorial at https://www.archrproject.com/bookdown/index.html#section and ran into the errors below. Is there any insight into what may be causing them?

Attach your log file

ArchR-addReproduciblePeakSet-3fd52e100b114a-Date-2022-08-10_Time-21-30-43.log

Describe the bug

MACS2 crashes with a segmentation fault when it is called by the addReproduciblePeakSet() function, as described in the hematopoiesis tutorial.

To Reproduce

I have been following the hematopoiesis tutorial exactly, copying and pasting each example into my SSH terminal as described here: https://www.archrproject.com/bookdown/calling-peaks-w-macs2.html

I edited pathToMacs2 to point to MACS2 within my conda environment, like so:

    projHeme4 <- addReproduciblePeakSet(
        ArchRProj = projHeme4,
        groupBy = "Clusters2",
        pathToMacs2 = "/gpfs/home/wrightlaudrey_gmail_com/anaconda3/envs/chip_seq/bin/macs2"
    )
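Before involving ArchR at all, it can help to confirm that the binary behind pathToMacs2 starts cleanly on its own. The following is a minimal sketch, not part of the tutorial: `check_macs2` is a hypothetical helper name, and the example path is the one from this post.

```shell
# Hypothetical helper: confirm that a MACS2 path is an executable file
# and that it can at least report its version.
check_macs2() {
  # $1 = path to the macs2 executable
  if [ ! -f "$1" ] || [ ! -x "$1" ]; then
    echo "not an executable file: $1" >&2
    return 1
  fi
  # --version prints the version string and exits without calling peaks.
  "$1" --version
}

# Example call (path copied from the post; adjust for your system):
# check_macs2 /gpfs/home/wrightlaudrey_gmail_com/anaconda3/envs/chip_seq/bin/macs2
```

If this step already crashes, the problem is likely in the conda environment rather than in the peak-calling step.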

Expected behavior

addReproduciblePeakSet() runs successfully, as shown in the tutorial, so that the getPeakSet() function can be used afterwards.

Session Info

R version 4.1.1 (2021-08-10)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: CentOS Linux 8

Matrix products: default
BLAS/LAPACK: /gpfs/home/wrightlaudrey_gmail_com/anaconda3/envs/chip_seq/lib/libopenblasp-r0.3.17.so

Random number generation:
 RNG:     L'Ecuyer-CMRG
 Normal:  Inversion
 Sample:  Rejection

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats4    grid      stats     graphics  grDevices utils
[8] datasets  methods   base

other attached packages:
 [1] BSgenome.Hsapiens.UCSC.hg19_1.4.3 BSgenome_1.62.0
 [3] rtracklayer_1.54.0                Biostrings_2.62.0
 [5] XVector_0.34.0                    rhdf5_2.38.1
 [7] SummarizedExperiment_1.24.0       Biobase_2.54.0
 [9] MatrixGenerics_1.6.0              Rcpp_1.0.9
[11] Matrix_1.4-1                      GenomicRanges_1.46.1
[13] GenomeInfoDb_1.30.1               IRanges_2.28.0
[15] S4Vectors_0.32.4                  BiocGenerics_0.40.0
[17] matrixStats_0.62.0                data.table_1.14.2
[19] stringr_1.4.0                     plyr_1.8.7
[21] magrittr_2.0.3                    ggplot2_3.3.6
[23] gtable_0.3.0                      gtools_3.9.3
[25] gridExtra_2.3                     ArchR_1.0.2

loaded via a namespace (and not attached):
 [1] tidyselect_1.1.2       purrr_0.3.4              lattice_0.20-45
 [4] colorspace_2.0-3       vctrs_0.4.1              generics_0.1.3
 [7] yaml_2.3.5             utf8_1.2.2               XML_3.99-0.10
[10] rlang_1.0.4            pillar_1.8.0             glue_1.6.2
[13] withr_2.5.0            DBI_1.1.3                BiocParallel_1.28.3
[16] GenomeInfoDbData_1.2.7 lifecycle_1.0.1          zlibbioc_1.40.0
[19] munsell_0.5.0          restfulr_0.0.15          Cairo_1.6-0
[22] fansi_1.0.3            scales_1.2.0             DelayedArray_0.20.0
[25] Rsamtools_2.10.0       rjson_0.2.21             stringi_1.7.8
[28] dplyr_1.0.9            BiocIO_1.4.0             cli_3.3.0
[31] tools_4.1.1            bitops_1.0-7             rhdf5filters_1.6.0
[34] RCurl_1.98-1.7         tibble_3.1.8             crayon_1.5.1
[37] pkgconfig_2.0.3        assertthat_0.2.1         Rhdf5lib_1.16.0
[40] R6_2.5.1               GenomicAlignments_1.30.0 compiler_4.1.

Additional Info

Here is the error output:

ArchR logging to : ArchRLogs/ArchR-addReproduciblePeakSet-3fd52e100b114a-Date-2022-08-10_Time-21-30-43.log
If there is an issue, please report to github with logFile!
2022-08-10 21:30:43 : Peak Calling Parameters!, 0.004 mins elapsed.
                Group nCells nCellsUsed nReplicates nMin nMax maxPeaks
B                   B    427        422           2  168  254   150000
CD4.M           CD4.M    757        595           2   95  500   150000
CD4.N           CD4.N   1251        548           2   48  500   150000
CLP               CLP    387        387           2   86  301   150000
Erythroid   Erythroid    872        722           2  222  500   150000
GMP               GMP   1097        769           2  269  500   150000
Mono             Mono   2645       1000           2  500  500   150000
NK                 NK    767        767           2  308  459   150000
pDC               pDC    308        299           2  148  151   149500
PreB             PreB    355        355           2   40  315   150000
Progenitor Progenitor   1384        646           2  146  500   150000
2022-08-10 21:30:43 : Batching Peak Calls!, 0.005 mins elapsed.
2022-08-10 21:30:43 : Batch Execution w/ safelapply!, 0 mins elapsed.
2022-08-10 21:30:43 : Group 1 of 22, Calling Peaks with MACS2!, 0.001 mins elapsed.
2022-08-10 21:30:43 : Group 2 of 22, Calling Peaks with MACS2!, 0.001 mins elapsed.
2022-08-10 21:30:43 : Group 3 of 22, Calling Peaks with MACS2!, 0.001 mins elapsed.
2022-08-10 21:30:43 : Group 4 of 22, Calling Peaks with MACS2!, 0.002 mins elapsed.
2022-08-10 21:30:43 : Group 5 of 22, Calling Peaks with MACS2!, 0.002 mins elapsed.
2022-08-10 21:30:43 : Group 6 of 22, Calling Peaks with MACS2!, 0.002 mins elapsed.
2022-08-10 21:30:43 : Group 7 of 22, Calling Peaks with MACS2!, 0.003 mins elapsed.
2022-08-10 21:30:43 : Group 8 of 22, Calling Peaks with MACS2!, 0.003 mins elapsed.
2022-08-10 21:30:43 : Group 9 of 22, Calling Peaks with MACS2!, 0.003 mins elapsed.
2022-08-10 21:30:43 : Group 10 of 22, Calling Peaks with MACS2!, 0.004 mins elapsed.
2022-08-10 21:30:44 : Group 11 of 22, Calling Peaks with MACS2!, 0.004 mins elapsed.
2022-08-10 21:30:44 : Group 12 of 22, Calling Peaks with MACS2!, 0.005 mins elapsed.
2022-08-10 21:30:50 : Group 13 of 22, Calling Peaks with MACS2!, 0.104 mins elapsed.
2022-08-10 21:30:50 : Group 14 of 22, Calling Peaks with MACS2!, 0.115 mins elapsed.
2022-08-10 21:30:51 : Group 15 of 22, Calling Peaks with MACS2!, 0.136 mins elapsed.
2022-08-10 21:30:52 : Group 16 of 22, Calling Peaks with MACS2!, 0.142 mins elapsed.
2022-08-10 21:30:53 : Group 17 of 22, Calling Peaks with MACS2!, 0.162 mins elapsed.
2022-08-10 21:30:57 : Group 18 of 22, Calling Peaks with MACS2!, 0.228 mins elapsed.
sh: fork: retry: Resource temporarily unavailable
2022-08-10 21:30:57 : Group 19 of 22, Calling Peaks with MACS2!, 0.232 mins elapsed.
2022-08-10 21:30:58 : Group 20 of 22, Calling Peaks with MACS2!, 0.238 mins elapsed.
2022-08-10 21:30:58 : Group 21 of 22, Calling Peaks with MACS2!, 0.251 mins elapsed.
2022-08-10 21:31:01 : Group 22 of 22, Calling Peaks with MACS2!, 0.292 mins elapsed.
sh: line 1: 4184962 Segmentation fault (core dumped) '/gpfs/home/wrightlaudrey_gmail_com/anaconda3/envs/chip_seq/bin/macs2' callpeak -g 2.7e+09 --name PreB.il_com/Save-ProjHeme4/PeakCalls/InsertionBeds --format BED --call-summits --keep-dup all --nomodel --nolambda --shift -75 --extsize 150 -q 0.1 > /dev/null 2> /dev/null
sh: line 1: 4184999 Segmentation fault (core dumped) '/gpfs/home/wrightlaudrey_gmail_com/anaconda3/envs/chip_seq/bin/macs2' callpeak -g 2.7e+09 --name Progenitor..scATAC_BMMC_R1-22 --treatment /gpfs/home/wrightlaudrey_gmail_com/Save-ProjHeme4/PeakCalls/InsertionBeds/Progenitor..scATAC_BMMC_R1-22.insertions.bed --outdir /gpfs/home/wrightlaudrey_gmail_com/Save-ProjHeme4/PeakCalls/InsertionBeds --format BED --call-summits --keep-dup all --nomodel --nolambda --shift -75 --extsize 150 -q 0.1 > /dev/null 2> /dev/null
sh: line 1: 4185001 Segmentation fault (core dumped) '/gpfs/home/wrightlaudrey_gmail_com/anaconda3/envs/chip_seq/bin/macs2' callpeak -g 2.7e+09 --name Progenitor..scATAC_CD34_BMMC_R1-21 --treatment /gpfs/home/wrightlaudrey_gmail_com/Save-ProjHeme4/PeakCalls/InsertionBeds/Progenitor..scATAC_CD34_BMMC_R1-21.insertions.bed --outdir /gpfs/home/wrightlaudrey_gmail_com/Save-ProjHeme4/PeakCalls/InsertionBeds --format BED --call-summits --keep-dup all --nomodel --nolambda --shift -75 --extsize 150 -q 0.1 > /dev/null 2> /dev/null
Error in (function (..., threads = 1, preschedule = FALSE) :
Error Found Iteration 3 :
[1] "Error in data.table::fread(summitsFile, select = c(1, 2, 3, 5)) : \n File '/gpfs/home/wrightlaudrey_gmail_com/Save-ProjHeme4/PeakCalls/InsertionBeds/CD4.M..scATAC_PBMC_R1-3_summits.bed' does not exist or is non-readable. getwd()=='/gpfs/home/w
<simpleError in data.table::fread(summitsFile, select = c(1, 2, 3, 5)): File '/gpfs/home/wrightlaudrey_gmail_com/Save-ProjHeme4/PeakCalls/InsertionBeds/CD4.M..scATAC_PBMC_R1-3_summits.bed' does not exist or is non-readable. getwd()=='/gpfs/home/wrig
Error Found Iteration 7 :
[1] "Error in data.table::fread(summitsFile, select = c(1, 2, 3, 5)) : \n File '/gpfs/home/wrightlaudrey_gmail_com/Save-ProjHeme4/PeakCalls/InsertionBeds/CLP..scATAC_CD34_BMMC_R1-7_summits.bed' does not exist or is non-readable. getwd()=='/gpfs/hom
<simpleError in data.table::fread(summitsFile, select = c(1, 2, 3, 5)): File '/gpfs/home/wrightlaudrey_gmail_com/Save-ProjHeme4/PeakCalls/InsertionBeds/CLP._.scATAC_CD34_BMMC_R1-7
In addition: Warning message:
In mclapply(..., mc.cores = threads, mc.preschedule = preschedule) :
  15 function calls resulted in an error
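In the log, every MACS2 command ends with `> /dev/null 2> /dev/null`, so whatever MACS2 printed before crashing is discarded. One way to see it is to rerun a single failing command by hand without the redirects. The sketch below reuses the arguments exactly as they appear in the log; `run_macs2_verbose` is a hypothetical helper name, and the variable values are copied from the post and will differ on other systems.

```shell
# Values copied from the log above; adjust for your system.
MACS2='/gpfs/home/wrightlaudrey_gmail_com/anaconda3/envs/chip_seq/bin/macs2'
BEDS='/gpfs/home/wrightlaudrey_gmail_com/Save-ProjHeme4/PeakCalls/InsertionBeds'
NAME='Progenitor..scATAC_BMMC_R1-22'

# Hypothetical helper: run one callpeak job with the same arguments as in
# the log, but leave stdout and stderr attached to the terminal.
run_macs2_verbose() {
  # $1 = macs2 executable, $2 = InsertionBeds directory, $3 = group/sample name
  "$1" callpeak -g 2.7e+09 --name "$3" \
    --treatment "$2/$3.insertions.bed" --outdir "$2" \
    --format BED --call-summits --keep-dup all --nomodel --nolambda \
    --shift -75 --extsize 150 -q 0.1
}

# Example call; the exit status is printed afterwards
# (139 usually indicates a segmentation fault).
# run_macs2_verbose "$MACS2" "$BEDS" "$NAME"; echo "exit status: $?"
```

If a single job run this way succeeds, that would suggest the crash depends on many jobs running at once rather than on the input file.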

allwright1 avatar Aug 11 '22 02:08 allwright1

Hi @allwright1! Thanks for using ArchR! Please make sure that your post belongs in the Issues section. Only bugs and error reports belong in the Issues section. Usage questions and feature requests should be posted in the Discussions section, not in Issues.
Before we help you, you must respond to the following questions unless your original post already contained this information:
1. If you've encountered an error, have you already searched previous Issues to make sure that this hasn't already been solved?
2. Can you recapitulate your error using the tutorial code and dataset? If so, provide a reproducible example.
3. Did you post your log file? If not, add it now.
4. Remove any screenshots that contain text and instead copy and paste the text using markdown's codeblock syntax (three consecutive backticks). You can do this by editing your original post.

rcorces avatar Aug 11 '22 02:08 rcorces

This doesn't look like a problem with ArchR to me. The first errors you are getting appear to be related to insufficient resources in your compute environment:

sh: fork: retry: Resource temporarily unavailable

and

sh: line 1: 4184962 Segmentation fault (core dumped)

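It seems like something in your environment is causing MACS2 itself to fail, maybe memory or process limits. The later fread errors just reflect summits files that MACS2 never wrote.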
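To check whether such limits are plausible on a given machine, the per-user limits can be printed from the same shell that launches R. This is a rough sketch under assumptions: `show_limits` is a hypothetical helper, the set of `ulimit` flags varies between shells, and a cluster scheduler may impose further limits that are not visible here.

```shell
# Hypothetical helper: print limits that commonly lie behind
# "fork: retry: Resource temporarily unavailable".
show_limits() {
  # Each lookup falls back to "unknown" if the shell or OS lacks it.
  echo "max user processes (ulimit -u): $(ulimit -u 2>/dev/null || echo unknown)"
  echo "max virtual memory in KB (ulimit -v): $(ulimit -v 2>/dev/null || echo unknown)"
  echo "processes currently owned by this user: $(ps -u "$(id -un)" -o pid= 2>/dev/null | wc -l)"
  echo "online CPUs: $(getconf _NPROCESSORS_ONLN 2>/dev/null || echo unknown)"
}

show_limits
```

If the process count is close to the process limit, or virtual memory is capped, it may be worth rerunning addReproduciblePeakSet() with a smaller `threads` value, since the log shows the MACS2 jobs being dispatched through `mclapply(..., mc.cores = threads)`.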

rcorces avatar Aug 11 '22 04:08 rcorces