tombo
DNA vs RNA from raw reads
Hi there,
I don't have an issue as such, more a (hopefully) quick question about how Tombo determines whether a sample is DNA or RNA from the raw reads in the scenario described in the documentation:
"If DNA or RNA sample type is not explicitly specified (via --dna or --rna options) the sample type will be detected automatically from the raw read files."
I have generated DNA samples in which we expect a large proportion of T to be replaced with U, and I noticed (initially by accident) that Tombo classifies these samples as RNA if I don't specify the type. Is there something specific that Tombo looks for in the raw reads to determine the sample type?
Many thanks!
There is no attribute in the FAST5 file indicating whether the sample is DNA or RNA, so Tombo has to guess. The function that does this is found here: https://github.com/nanoporetech/tombo/blob/master/tombo/tombo_helper.py#L872 As you have noted, I would suggest using the --dna or --rna flag whenever possible so that this function cannot guess incorrectly.
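For anyone scripting this, here is a minimal sketch of one way to make sure the sample type is always passed explicitly. It assumes the usual `tombo resquiggle <fast5 directory> <reference FASTA>` invocation; the helper names `build_resquiggle_cmd` and `run_resquiggle` and the example paths are hypothetical and not part of Tombo.

```python
import subprocess

def build_resquiggle_cmd(fast5_dir, reference_fasta, sample_type):
    """Build a `tombo resquiggle` command that always names the sample type.

    Hypothetical helper: only the --dna / --rna flags come from the Tombo
    documentation quoted above; the positional argument order is assumed.
    """
    if sample_type not in ("dna", "rna"):
        raise ValueError("sample_type must be 'dna' or 'rna'")
    # Passing the flag explicitly means Tombo never has to auto-detect the type.
    return ["tombo", "resquiggle", fast5_dir, reference_fasta, "--" + sample_type]

def run_resquiggle(fast5_dir, reference_fasta, sample_type):
    """Run the command; raises CalledProcessError if Tombo exits non-zero."""
    cmd = build_resquiggle_cmd(fast5_dir, reference_fasta, sample_type)
    return subprocess.run(cmd, check=True)
```

For the U-containing DNA samples described above, this would be called as `run_resquiggle("path/to/fast5s", "reference.fasta", "dna")`, which keeps the automatic detection out of the picture entirely.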