BiocParallel error when using Bambu in quantification mode

Open apsteinberg opened this issue 1 year ago • 2 comments

Hi Andre and the Bambu team,

I am encountering an error when running Bambu in quantification mode on some of my samples. After our discussions in issue #448, I decided to include all detected transcripts in my analysis, since NDR < 0.4 for all of the transcripts. However, for some of my samples I am now encountering a BiocParallel error, which reads:

Error: BiocParallel errors
  1 remote errors, element index: 1
  0 unevaluated and other errors
  first remote error:
Error in if (annotatedIntronNumberNew > annotatedIntronNumber & !is.na(annotatedIntronNumber)) {: missing value where TRUE/FALSE needed
In addition: There were 15 warnings (use warnings() to see them)

The commands I am using to run Bambu are:

annotations <- prepareAnnotations(gtf.file)

create_directory(outdir)

se.quantOnly <- bambu(reads = input_bam,
                      annotations = gtf.file,
                      genome = fa.file,
                      discovery = FALSE,
                      verbose=TRUE,
                      lowMemory=TRUE,
                      yieldSize=1e6)

writeBambuOutput(se.quantOnly, path = outdir)

se_out <- sprintf("%s/se.quantOnly.RData",outdir)
save(se.quantOnly, file = se_out)

I am attaching the full log here in case it is helpful:

quant_biocparallel_error.zip

This was run with bambu 3.5.1. Any clues as to what's going on here?

Thanks, Asher

apsteinberg avatar Sep 23 '24 21:09 apsteinberg

Hi @apsteinberg ,

Sorry for the late reply!

Based on the error message, it looks like a few samples had zero reads aligned to the reference genome. This could be due to data quality issues or a mismatch between the genome FASTA and the reads (e.g., seqname style mismatch or out-of-range sequences). You'll need to investigate further to pinpoint the cause.

I recommend pre-checking each sample and using only those with at least some reads.
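
A pre-check along these lines could be sketched in R as below. This is only an illustration, not part of bambu: it assumes the Rsamtools package is installed and that each BAM file has an index, and the helper names `summarise_mapped()` and `bams_with_reads()` as well as the file paths are hypothetical.

```r
# Sketch only: assumes Rsamtools is installed and every BAM is indexed (.bai).
# summarise_mapped() and bams_with_reads() are hypothetical helper names.

# Sum mapped reads over all reference sequences in one idxstats-style table
# (expected columns: seqnames, seqlength, mapped, unmapped).
summarise_mapped <- function(idxstats) {
  as.numeric(sum(idxstats$mapped))
}

# Return only the BAM paths that have at least `min_reads` mapped reads.
bams_with_reads <- function(bam_files, min_reads = 1) {
  mapped <- vapply(
    bam_files,
    function(f) summarise_mapped(Rsamtools::idxstatsBam(f)),
    numeric(1)
  )
  bam_files[mapped >= min_reads]
}

# Usage (paths are placeholders):
# usable <- bams_with_reads(c("sample1.bam", "sample2.bam"))
```

The filtered vector could then be passed to the `reads` argument of `bambu()` in place of the full list of samples.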

Hope this helps! Let me know if you have any other questions. Thank you. Warm regards, Ying

cying111 avatar Feb 03 '25 06:02 cying111

Hi @cying111 ,

I also encountered the same error when "lowMemory=TRUE" is set. However, when I set "lowMemory=FALSE", the error disappeared. I am pretty sure that 99% of my reads were aligned to the reference genome by minimap2, so I guess that if some chromosomes do not have any reads mapped, the error may occur because lowMemory processes the data by chromosome? I also saw the same situation at https://github.com/GoekeLab/bambu/issues/439. Maybe this can be fixed in the next version.
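
One way to test this guess would be to list the reference sequences that have no mapped reads in an affected BAM. The sketch below is only an illustration under the assumption that Rsamtools is installed and the BAM is indexed; `empty_seqnames()` is a hypothetical helper and the file path is a placeholder. It does not confirm how lowMemory behaves internally.

```r
# Sketch only: assumes Rsamtools is installed and the BAM is indexed.
# empty_seqnames() is a hypothetical helper, not part of bambu.

# Given an idxstats-style table (columns: seqnames, seqlength, mapped,
# unmapped), return the reference sequences with zero mapped reads.
empty_seqnames <- function(idxstats) {
  # Drop a "*" row if present (samtools idxstats uses it for
  # unplaced, unmapped reads).
  idxstats <- idxstats[idxstats$seqnames != "*", , drop = FALSE]
  as.character(idxstats$seqnames[idxstats$mapped == 0])
}

# Usage (path is a placeholder):
# empty_seqnames(Rsamtools::idxstatsBam("sample1.bam"))
```

If the failing samples are exactly those with at least one empty chromosome, that would be consistent with the per-chromosome explanation.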

hmutpw avatar Feb 21 '25 08:02 hmutpw