Error in readBin(con, raw(), n) : error reading from the connection

Open ninglinwang opened this issue 2 years ago • 6 comments

Hi Ben, I ran into a problem while using the DADA2 workflow for Big Data: Paired-end. Everything went well until the sample inference and paired-end read merging step. I used this code:

for(sam in sample.names) {
  cat("Processing:", sam, "\n")
  derepF <- derepFastq(filtFs[[sam]])
  ddF <- dada(derepF, err=errF, multithread=FALSE)
  derepR <- derepFastq(filtRs[[sam]])
  ddR <- dada(derepR, err=errR, multithread=FALSE)
  merger <- mergePairs(ddF, derepF, ddR, derepR)
  mergers[[sam]] <- merger
}

It went well for the first several samples, and then I got the error:

Error in readBin(con, raw(), n) : error reading from the connection

My colleague and I both got the error when processing the same data, but it occurred on a different sample for each of us. I searched the existing issues, but it seems no one has reported this problem. Is it associated with my data or the code?

Thank you very much.

ninglinwang, Mar 31 '22 02:03

This error is happening in code being called by DADA2, but that is implemented outside the DADA2 package. There is some issue with being able to open one of the files you are providing.

The best way to diagnose this would be to try to produce a "minimal example" in which you can reproduce this error using a specific filename (rather than a vector of filenames) and a single command.

benjjneb, Apr 06 '22 00:04
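
One way to build the minimal example suggested above is to run a single reading command on each file separately and record which call fails. The sketch below is an illustration, not code from this thread: `read_whole_file` and `find_failing_files` are hypothetical helpers, and the default reader uses only base R so that the block runs without DADA2 installed.

```r
# Hypothetical helper: read a (possibly gzipped) file to the end in chunks
# and return the number of decompressed bytes read.
read_whole_file <- function(path) {
  con <- gzfile(path, open = "rb")
  on.exit(close(con))
  n_bytes <- 0
  repeat {
    chunk <- readBin(con, what = "raw", n = 1e6)
    if (length(chunk) == 0L) break
    n_bytes <- n_bytes + length(chunk)
  }
  n_bytes
}

# Hypothetical helper: call `reader` on one file at a time and return the
# error message for each file, or NA where the call succeeded.
find_failing_files <- function(paths, reader = read_whole_file) {
  vapply(paths, function(p) {
    tryCatch({
      reader(p)
      NA_character_
    }, error = function(e) conditionMessage(e))
  }, character(1))
}
```

With DADA2 installed, `find_failing_files(filtFs, reader = dada2::derepFastq)` would exercise the same reading call as the loop in the original post, one file at a time. Because the failing sample differed between runs in this thread, a single clean pass may not be conclusive.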

Thank you, I will have a try.

ninglinwang, Apr 08 '22 02:04

Hello! I have the same problem. When I use mergePairs, the merging starts but crashes partway through with the error:

"Error in readBin(con, raw(), n) : error reading from the connection"

Weirdly, this happens at a different sample every time I try. I have also checked that all of my filtFs and filtRs paths can be opened and that dadaFs and dadaRs contain the same samples in the same order as filtFs and filtRs. I also tried both RStudio and plain R. Nothing seems to help. It does seem that there is some problem opening a file, but since it's not tied to a specific fastq file, I'm at a loss as to how to fix this. Any thoughts?

Here's my code:

library(dada2)

#READ IN

#dada objects
dadaFs<-readRDS("dadaFs.rds")
dadaRs<-readRDS("dadaRs.rds")

#file paths
filtFs<-readRDS("filtFs.rds")
filtRs<-readRDS("filtRs.rds")

#MERGE

mergers <- mergePairs(dadaFs, filtFs, dadaRs, filtRs, verbose=TRUE)

And here's my session Info:

R version 4.2.3 (2023-03-15)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Monterey 12.3.1

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] dada2_1.26.0 Rcpp_1.0.10

loaded via a namespace (and not attached):
[1] SummarizedExperiment_1.28.0 tidyselect_1.2.0
[3] reshape2_1.4.4 lattice_0.21-8
[5] colorspace_2.1-0 vctrs_0.6.2
[7] generics_0.1.3 stats4_4.2.3
[9] utf8_1.2.3 rlang_1.1.1
[11] pillar_1.9.0 DBI_1.1.3
[13] glue_1.6.2 BiocParallel_1.32.6
[15] BiocGenerics_0.44.0 RColorBrewer_1.1-3
[17] matrixStats_0.63.0 jpeg_0.1-10
[19] GenomeInfoDbData_1.2.9 lifecycle_1.0.3
[21] plyr_1.8.8 stringr_1.5.0
[23] zlibbioc_1.44.0 MatrixGenerics_1.10.0
[25] Biostrings_2.66.0 munsell_0.5.0
[27] gtable_0.3.3 hwriter_1.3.2.1
[29] codetools_0.2-19 latticeExtra_0.6-30
[31] Biobase_2.58.0 IRanges_2.32.0
[33] GenomeInfoDb_1.34.9 parallel_4.2.3
[35] fansi_1.0.4 scales_1.2.1
[37] DelayedArray_0.24.0 S4Vectors_0.36.2
[39] RcppParallel_5.1.7 XVector_0.38.0
[41] ShortRead_1.56.1 deldir_1.0-9
[43] interp_1.1-4 Rsamtools_2.14.0
[45] ggplot2_3.4.4 png_0.1-8
[47] stringi_1.7.12 dplyr_1.1.2
[49] GenomicRanges_1.50.2 grid_4.2.3
[51] cli_3.6.1 tools_4.2.3
[53] bitops_1.0-7 magrittr_2.0.3
[55] RCurl_1.98-1.12 tibble_3.2.1
[57] crayon_1.5.2 pkgconfig_2.0.3
[59] Matrix_1.5-4 R6_2.5.1
[61] GenomicAlignments_1.34.1 compiler_4.2.3

nuorenarra, Dec 08 '23 07:12

#file paths
filtFs<-readRDS("filtFs.rds")
filtRs<-readRDS("filtRs.rds")

This is potentially problematic. How stable are the file paths between when you saved these and now? If there is any discrepancy, errors like the one you are seeing are possible.

benjjneb, Dec 08 '23 08:12
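
A quick way to test whether saved paths still resolve is to check each one against the current file system. The following is a minimal sketch under assumptions: `check_paths` is a hypothetical helper, and it only reports whether each path exists and how large the file is, not whether the file can be read to the end.

```r
# Hypothetical helper: report, for each saved path, whether the file
# currently exists and its size in bytes (NA if it does not exist).
check_paths <- function(paths) {
  data.frame(
    path = unname(paths),
    exists = unname(file.exists(paths)),
    size_bytes = unname(file.size(paths)),
    stringsAsFactors = FALSE
  )
}
```

For the code quoted above, `check_paths(filtFs)` and `check_paths(filtRs)` would flag missing or zero-byte files. Relative paths are resolved against `getwd()`, so the result depends on the working directory of the current session.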

I made a loop that calls readFastq() (from the ShortRead package) on each path and checks that it opens. All of the paths seem fine: the files open OK.

nuorenarra, Dec 11 '23 11:12

I had this same error when calling plotQualityProfile with a vector of file names:

Error: BiocParallel errors
  1 remote errors, element index: 1
  0 unevaluated and other errors
  first remote error: error in evaluating the argument 'dirPath' in selecting a method for function 'qa': error reading from the connection

I isolated the gzip file that was crashing it and tested the file with gzip -t. A corrupted gzip file caused the problem in my case.

isaacovercast, Feb 09 '24 21:02
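
The same `gzip -t` integrity test can be run over every input file from within R. This is a hedged sketch rather than part of DADA2: `gzip_ok` is a hypothetical helper, and it assumes a `gzip` executable is available on the system PATH.

```r
# Hypothetical helper: run `gzip -t` on each file and return TRUE where the
# exit status is 0 (archive intact), FALSE otherwise.
# Assumes a `gzip` executable can be found on the PATH.
gzip_ok <- function(paths) {
  vapply(paths, function(p) {
    status <- suppressWarnings(
      system2("gzip", c("-t", shQuote(p)), stdout = FALSE, stderr = FALSE)
    )
    identical(as.integer(status), 0L)
  }, logical(1))
}
```

For example, `names(which(!gzip_ok(filtFs)))` would list the entries of `filtFs` that fail the test, which could then be re-downloaded or regenerated.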