Is it possible to get transcript coverage estimates from Bambu?

Open bernardo-heberle opened this issue 3 years ago • 1 comments

Hi,

I am working on a project where it would be helpful to have coverage estimates for transcripts. I know that counts should theoretically be a good estimate of coverage since we are dealing with long-reads, but RNA/cDNA quality scores are never perfect and it is expected that there will be a significant number of sheared/degraded molecules being sequenced.

I was wondering if you would consider adding coverage estimates for transcripts to the Bambu output?

To clarify: By coverage I mean the number of bases mapping to a transcript divided by the length of that transcript.
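
As a minimal sketch of that definition, assuming the number of aligned bases per read is already known (the function name and the numbers below are hypothetical, purely for illustration):

```python
def mean_coverage(aligned_bases_per_read, transcript_length):
    """Bases mapped to a transcript divided by the transcript length."""
    if transcript_length <= 0:
        raise ValueError("transcript_length must be positive")
    return sum(aligned_bases_per_read) / transcript_length


# Hypothetical 1500 nt transcript with one full-length read and two
# half-length (sheared/degraded) reads assigned to it.
aligned_bases = [1500, 750, 750]

read_count = len(aligned_bases)                # 3 reads
coverage = mean_coverage(aligned_bases, 1500)  # 3000 / 1500 = 2.0
print(read_count, coverage)
```

Here the read count is 3 but the coverage is 2.0, which is the gap between counts and coverage that fragmented molecules would introduce.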

Alternatively, if you have a suggestion on how I can calculate coverage from the current Bambu output I would be happy to create a script to do that. I just couldn't think of a proper way to do it so far.

Thank you, Bernardo

Mar 14 '22 15:03 bernardo-heberle

Hi @bernardo-heberle, this is a good point. We should definitely consider adding this in a future release, although it will probably take some time to implement.

For now, you can get coverage from the bambu output with the following steps:

1. Create a transcriptome FASTA file based on the bambu annotations.
2. Align the reads to this FASTA file (transcriptome alignment).
3. Calculate coverage from the transcriptome alignments.
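
The last step could be sketched roughly as follows. This is only an illustration, not part of bambu: it assumes the transcriptome alignments are available as SAM text with `@SQ` header lines (transcript names and lengths), and it uses the definition from the question, aligned bases divided by transcript length. The function name `transcript_coverage` and the example records are hypothetical. Secondary alignments are skipped by default so that multi-mapping reads are not counted several times; whether that is the right choice depends on the analysis.

```python
import re
from collections import defaultdict

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
# CIGAR operations in which a read base is aligned to a transcript base.
ALIGNED_OPS = {"M", "=", "X"}


def transcript_coverage(sam_lines, skip_secondary=True):
    """Return {transcript: aligned bases / transcript length} from SAM lines."""
    lengths = {}
    aligned = defaultdict(int)
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("@"):
            # Transcript lengths come from the @SQ header lines (SN, LN tags).
            if line.startswith("@SQ"):
                tags = dict(f.split(":", 1) for f in line.split("\t")[1:])
                lengths[tags["SN"]] = int(tags["LN"])
            continue
        fields = line.split("\t")
        flag, rname, cigar = int(fields[1]), fields[2], fields[5]
        if flag & 0x4 or rname == "*" or cigar == "*":
            continue  # unmapped read
        if skip_secondary and flag & 0x100:
            continue  # secondary alignment
        for n, op in CIGAR_RE.findall(cigar):
            if op in ALIGNED_OPS:
                aligned[rname] += int(n)
    return {tx: aligned[tx] / ln for tx, ln in lengths.items() if ln > 0}


# Hypothetical SAM records, for illustration only.
example = [
    "@SQ\tSN:tx1\tLN:1000",
    "@SQ\tSN:tx2\tLN:500",
    "r1\t0\ttx1\t1\t60\t600M\t*\t0\t0\t*\t*",
    "r2\t0\ttx1\t101\t60\t50S300M10I100M\t*\t0\t0\t*\t*",
    "r3\t256\ttx2\t1\t0\t200M\t*\t0\t0\t*\t*",   # secondary, skipped
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",          # unmapped, skipped
]
result = transcript_coverage(example)
print(result)
```

With a real SAM file, the same function can be called on the open file handle, for example `transcript_coverage(open("alignments.sam"))`, where the file name is a placeholder.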

Thank you. Regards, Ying

Mar 16 '22 08:03 cying111