RNA-Bloom
RNA-Bloom Generates Empty FASTA Without Error
As per title. Input file: test.fastq.gz
Command:
rnabloom -t 2 -outdir test_out -long test.fastq -ntcard
It should probably again report too little input data? Big thanks for all of your help!!
RNA-Bloom v2.0.0
java --version
openjdk 17.0.3-internal 2022-04-19
OpenJDK Runtime Environment (build 17.0.3-internal+0-adhoc..src)
OpenJDK 64-Bit Server VM (build 17.0.3-internal+0-adhoc..src, mixed mode, sharing)
Thanks for reporting this! Yes, this happens when there are too few reads.
I was able to replicate this, but this is not a bug.
The assembled sequences are too short and they all end up in rnabloom.transcripts.short.fa (instead of rnabloom.transcripts.fa).
I have added a warning message for this scenario. The changes will be incorporated in the next release!
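Until that release is out, a small post-run check can flag this case. The following is a minimal Python sketch, not part of RNA-Bloom: it assumes both FASTA files sit directly inside the directory passed to -outdir (the thread does not confirm the layout), and `count_fasta_records` and `diagnose_output` are hypothetical helper names.

```python
from pathlib import Path


def count_fasta_records(path):
    """Count FASTA header lines ('>'); a missing file counts as 0 records."""
    path = Path(path)
    if not path.is_file():
        return 0
    with path.open() as handle:
        return sum(1 for line in handle if line.startswith(">"))


def diagnose_output(outdir):
    """Describe which of the two output files actually holds sequences.

    Assumption: both files are written directly under `outdir`.
    """
    outdir = Path(outdir)
    n_main = count_fasta_records(outdir / "rnabloom.transcripts.fa")
    n_short = count_fasta_records(outdir / "rnabloom.transcripts.short.fa")
    if n_main == 0 and n_short > 0:
        return f"main file is empty; all {n_short} sequences are in the short file"
    if n_main == 0:
        return "no sequences found in either file"
    return f"{n_main} sequences in the main file, {n_short} in the short file"


if __name__ == "__main__":
    # "test_out" is the -outdir value from the command in this issue.
    print(diagnose_output("test_out"))
```

If the first message comes back, the run did assemble something, but every sequence fell under the length cutoff.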
What is the difference between these files besides above/below length threshold? Is there evidence that the longer transcripts are better supported/higher quality?
Not at all. The length threshold is the only determining factor for assigning sequences to these two files.
Cool. If that's the case, why separate the files at all? Why not have a single assembly output file, with an optional param to filter contigs shorter than x length, with default x=0?
There is already an option for that (i.e. -length) and its default value is 200, which is what separates the sequences into the two files. All RNA-seq assemblers I can think of have a similar length cutoff option, with a default of 100 to 200 nt. It is not set to zero because very short sequences can potentially be noise.
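To make the split concrete, here is a rough Python sketch of partitioning sequences by a length cutoff. It is an illustration only, not RNA-Bloom's code: `split_by_length` is a hypothetical name, and treating a sequence of exactly 200 nt as "long" is an assumption, since the thread does not say whether the cutoff is inclusive.

```python
def split_by_length(records, min_length=200):
    """Partition (name, sequence) pairs by sequence length.

    Returns (long_records, short_records). The boundary is treated as
    inclusive (len >= min_length goes to the long list); that choice is an
    assumption made for this sketch.
    """
    long_records, short_records = [], []
    for name, seq in records:
        if len(seq) >= min_length:
            long_records.append((name, seq))
        else:
            short_records.append((name, seq))
    return long_records, short_records


if __name__ == "__main__":
    # Made-up sequences purely for illustration.
    records = [("tx1", "A" * 450), ("tx2", "C" * 150), ("tx3", "G" * 90)]
    long_records, short_records = split_by_length(records)
    print([name for name, _ in long_records])   # names at or above the cutoff
    print([name for name, _ in short_records])  # names below the cutoff
```

With few input reads, every assembled sequence can land in the second list, which corresponds to the empty rnabloom.transcripts.fa reported above.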
Thanks for explaining. Contrary to your earlier answer then, it does sound like there is evidence that the longer transcripts are likely higher quality. I guess a warning message will suffice if the non-short transcripts file is empty. Thanks again!
Sorry, I thought you were asking whether RNA-Bloom uses any evidence to determine that threshold.