dada2
Error in readBin(con, raw(), n) : error reading from the connection
Hi Ben, I ran into a problem while using the DADA2 workflow for Big Data: Paired-end. Everything went well until sample inference and merging of paired-end reads. I used this code:
for(sam in sample.names) {
  cat("Processing:", sam, "\n")
  derepF <- derepFastq(filtFs[[sam]])
  ddF <- dada(derepF, err=errF, multithread=FALSE)
  derepR <- derepFastq(filtRs[[sam]])
  ddR <- dada(derepR, err=errR, multithread=FALSE)
  merger <- mergePairs(ddF, derepF, ddR, derepR)
  mergers[[sam]] <- merger
}
It went well for the first several samples, and then I got the error:
Error in readBin(con, raw(), n) : error reading from the connection
My colleague and I both got the error when processing the same data, but it occurred at a different individual sample for each of us. I searched the issues, but it seems no one has reported this problem. Is it related to my data or to the code?
Thank you very much.
This error is happening in code being called by DADA2, but that is implemented outside the DADA2 package. There is some issue with being able to open one of the files you are providing.
The best way to diagnose this would be to try to produce a "minimal example" in which you can reproduce this error using a specific filename (rather than a vector of filenames) and a single command.
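One way to narrow the failure down to a single filename and a single command is to run the reader on each path separately and record which calls fail. The following is only a sketch under stated assumptions: `find_failing_files` and its `reader` argument are hypothetical names, not part of DADA2, and the default reader just reads each file with base R (whose `file()` connections detect gzip compression when reading).

```r
# Hypothetical helper (not part of DADA2): call `reader` on each path
# separately and collect the error message for every path that fails.
find_failing_files <- function(paths,
                               reader = function(p) readLines(p, warn = FALSE)) {
  failures <- character(0)
  for (p in paths) {
    msg <- tryCatch({
      reader(p)
      NA_character_              # NA marks a file that was read without error
    }, error = function(e) conditionMessage(e))
    if (!is.na(msg)) failures[p] <- msg
  }
  failures                       # named character vector: path -> error message
}
```

With DADA2 loaded, passing `reader = dada2::derepFastq` would apply the same per-file check with the function used in the loop above. Any path named in the result is a candidate for a one-file, one-command reproduction.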
Thank you, I will have a try.
Hello! I have the same problem. When I use mergePairs, the merging starts but crashes partway through with the error:
"Error in readBin(con, raw(), n) : error reading from the connection"
Weirdly, this happens at a different sample every time I try. I have also checked that all of my filtFs and filtRs paths can be opened and that all dadaFs and dadaRs have the same samples in the same order as filtFs and filtRs. I also tried with both RStudio and plain R. Nothing seems to help. It does seem that there is some problem opening a file, but since it's not tied to a specific fastq file, I'm at a loss as to how to fix this. Any thoughts?
Here's my code:
library(dada2)
#READ IN
#dada objects
dadaFs<-readRDS("dadaFs.rds")
dadaRs<-readRDS("dadaRs.rds")
#file paths
filtFs<-readRDS("filtFs.rds")
filtRs<-readRDS("filtRs.rds")
#MERGE
mergers <- mergePairs(dadaFs, filtFs, dadaRs, filtRs, verbose=TRUE)
And here's my session info:
R version 4.2.3 (2023-03-15)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Monterey 12.3.1
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] dada2_1.26.0 Rcpp_1.0.10
loaded via a namespace (and not attached):
[1] SummarizedExperiment_1.28.0 tidyselect_1.2.0
[3] reshape2_1.4.4 lattice_0.21-8
[5] colorspace_2.1-0 vctrs_0.6.2
[7] generics_0.1.3 stats4_4.2.3
[9] utf8_1.2.3 rlang_1.1.1
[11] pillar_1.9.0 DBI_1.1.3
[13] glue_1.6.2 BiocParallel_1.32.6
[15] BiocGenerics_0.44.0 RColorBrewer_1.1-3
[17] matrixStats_0.63.0 jpeg_0.1-10
[19] GenomeInfoDbData_1.2.9 lifecycle_1.0.3
[21] plyr_1.8.8 stringr_1.5.0
[23] zlibbioc_1.44.0 MatrixGenerics_1.10.0
[25] Biostrings_2.66.0 munsell_0.5.0
[27] gtable_0.3.3 hwriter_1.3.2.1
[29] codetools_0.2-19 latticeExtra_0.6-30
[31] Biobase_2.58.0 IRanges_2.32.0
[33] GenomeInfoDb_1.34.9 parallel_4.2.3
[35] fansi_1.0.4 scales_1.2.1
[37] DelayedArray_0.24.0 S4Vectors_0.36.2
[39] RcppParallel_5.1.7 XVector_0.38.0
[41] ShortRead_1.56.1 deldir_1.0-9
[43] interp_1.1-4 Rsamtools_2.14.0
[45] ggplot2_3.4.4 png_0.1-8
[47] stringi_1.7.12 dplyr_1.1.2
[49] GenomicRanges_1.50.2 grid_4.2.3
[51] cli_3.6.1 tools_4.2.3
[53] bitops_1.0-7 magrittr_2.0.3
[55] RCurl_1.98-1.12 tibble_3.2.1
[57] crayon_1.5.2 pkgconfig_2.0.3
[59] Matrix_1.5-4 R6_2.5.1
[61] GenomicAlignments_1.34.1 compiler_4.2.3
#file paths
filtFs<-readRDS("filtFs.rds")
filtRs<-readRDS("filtRs.rds")
This is potentially problematic. How stable are the file paths between when you saved these and now? If there is any discrepancy, errors like the one you are seeing are possible.
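A quick way to check this is to test whether every saved path still resolves to an existing, readable, non-empty file. The sketch below uses only base R; `check_paths` is a hypothetical helper name, and it assumes the saved objects are character vectors of paths. Relative paths are resolved against the current working directory, so the result can differ between sessions.

```r
# Hypothetical helper: report, for each saved path, whether it still points
# to an existing, readable file, and how large that file is.
check_paths <- function(paths) {
  p <- unname(as.character(paths))
  data.frame(
    path     = p,
    exists   = file.exists(p),
    readable = file.access(p, mode = 4) == 0,  # 0 means read permission is granted
    size     = file.size(p),                   # NA when the file is missing
    stringsAsFactors = FALSE
  )
}

# Example use with the objects from the post above:
# status <- check_paths(readRDS("filtFs.rds"))
# status[!status$exists | !status$readable | is.na(status$size) | status$size == 0, ]
```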
I made a loop that calls readFastq() (from the ShortRead package) on each path and checks that the file can be opened. All of the paths seem fine in that the files open OK.
I had this same error when calling plotQualityProfile with a vector of file names:
Error: BiocParallel errors
1 remote errors, element index: 1
0 unevaluated and other errors
first remote error: error in evaluating the argument 'dirPath' in selecting a method for function 'qa': error reading from the connection
I isolated the gzip file that was causing the crash and tested it with gzip -t. A corrupted gzip file caused this problem in my case.
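The same integrity test can be run over a whole vector of files from within R. This is a minimal sketch that assumes the `gzip` command-line tool is installed and on the PATH; `test_gzip` is a hypothetical helper name, not a DADA2 or ShortRead function.

```r
# Hypothetical helper: run `gzip -t` on each file and return TRUE for files
# that pass the integrity test. Assumes the gzip CLI is available.
test_gzip <- function(paths) {
  vapply(paths, function(p) {
    status <- system2("gzip", c("-t", shQuote(p)), stdout = FALSE, stderr = FALSE)
    status == 0                  # gzip exits with a non-zero status on a damaged file
  }, logical(1))
}

# Example: list the entries that fail the test
# ok <- test_gzip(c(filtFs, filtRs))
# names(ok)[!ok]
```

The result is named by the input paths, or by the names of the input vector if it has any, so the failing entries can be read off directly.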