sangeranalyseR
Can sangeranalyseR handle spurious indels?
It is difficult to provide a reprex here, as I don't want to just paste my sequences. But I am wondering if sangeranalyseR is the appropriate tool when there may be spurious indels in the raw reads. I am sequencing a single coding gene (ca. 1300 bp) with no introns, using four primers (a pair of sequencing primers and a pair of internal primers), so the consensus should just be a single coding region with no indels. I am trying to use a reference AA sequence (refAminoAcidSeq), but I still have problems with obviously wrong frameshifts: portions of the contig where things are off by a single base, resulting in a completely jumbled consensus. I know that sangeranalyseR can't be used to edit the reads, but I'm surprised it apparently can't handle these sorts of sequencing errors, which are very common in my experience. Or could it just be my settings? Please let me know if I should share a few example reads via email or similar.
Hi @joelnitta,
If you could share an example with us via email (the smallest example that doesn't work the way you want, but seems like something we could implement), we'll take a look.
I agree that in principle this doesn't seem like it should be too hard. Simply matching a sequence with indels against a back-translated AA reference sequence should fix the problem, right? It's been a while since I wrote the refAminoAcidSeq feature, so I'll have to dig into what it's doing. Then we can see if we can improve it.
Rob
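For anyone following the thread, the idea above (matching a read with indels against the amino-acid reference) can be pictured as a small codon-aware alignment. The Python sketch below is only an illustration of that idea, not what sangeranalyseR or its refAminoAcidSeq option actually does. The function names (repair_frameshifts, best_codon, translate) and the scoring values are made up for this example, and it assumes the read is already in the forward orientation, spans the whole reference, and contains only single-base indels.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Hypothetical scoring values chosen for illustration only.
MATCH, MISMATCH, FRAMESHIFT = 2, -1, -4


def translate(dna):
    """Translate in frame 0; codons containing N become 'X'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def best_codon(chunk, aa):
    """Turn a 2-, 3- or 4-base chunk into one codon and score it against aa."""
    if len(chunk) == 3:
        return chunk, MATCH if CODON_TABLE.get(chunk) == aa else MISMATCH
    if len(chunk) == 2:
        # One base is missing: pad with N to restore the frame.
        return chunk + "N", FRAMESHIFT
    # Four bases: one is spurious. Prefer a deletion that recovers aa.
    for k in range(4):
        codon = chunk[:k] + chunk[k + 1:]
        if CODON_TABLE.get(codon) == aa:
            return codon, FRAMESHIFT + MATCH
    return chunk[:3], FRAMESHIFT + MISMATCH


def repair_frameshifts(read, ref_aa):
    """Globally align read to ref_aa, letting each residue consume 2-4 bases."""
    n, m = len(read), len(ref_aa)
    neg = float("-inf")
    score = [[neg] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    score[0][0] = 0
    for j in range(1, m + 1):
        for i in range(2, n + 1):
            for step in (3, 2, 4):
                if i - step < 0 or score[i - step][j - 1] == neg:
                    continue
                codon, s = best_codon(read[i - step:i], ref_aa[j - 1])
                total = score[i - step][j - 1] + s
                if total > score[i][j]:
                    score[i][j] = total
                    back[i][j] = (step, codon)
    if score[n][m] == neg:
        raise ValueError("read cannot be aligned to the reference")
    # Trace back from the end, collecting one repaired codon per residue.
    codons, i, j = [], n, m
    while j > 0:
        step, codon = back[i][j]
        codons.append(codon)
        i, j = i - step, j - 1
    return "".join(reversed(codons))


ref_aa = "MKTAYIAK"
clean = "ATGAAAACCGCGTATATTGCGAAA"
with_insertion = clean[:9] + "G" + clean[9:]   # one spurious base
with_deletion = clean[:10] + clean[11:]        # one missing base

print(translate(repair_frameshifts(with_insertion, ref_aa)))
print(translate(repair_frameshifts(with_deletion, ref_aa)))
```

A real implementation would also need local alignment (reads rarely cover the whole gene), quality scores to decide which base to drop, and handling of reverse-complement reads, none of which this sketch attempts.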