
Support for indels in FastaVariant class

Open danielecook opened this issue 9 years ago • 4 comments

Hello - I was wondering if it would be possible to recognize indels within the FastaVariant class? What are the challenges involved there?

danielecook avatar Dec 17 '15 18:12 danielecook

Hmmm... I haven't paid much attention to the FastaVariant class recently. I think indels shouldn't be too hard, but I omitted them originally as I wanted to maintain a 1-1 mapping with the original reference coordinates. Do you have a use case for this?

mdshw5 avatar Dec 31 '15 15:12 mdshw5

Sure - well, I can describe my reason for wanting this implemented.

I am developing a number of utilities for working with VCF files. One of the tools helps validate variants within VCFs. It generates primers (using primer3) for Sanger sequencing or snip-SNP verification based on any variants provided as input. However, when generating primers or looking for restriction sites, I want to account for neighboring variation, both to increase the chances that primers work (by incorporating alternative alleles/indels) and to make sure predicted product sizes (resulting from differences in restriction sites) are accurate.

In terms of coordinates - I don't think it is an issue. I always intend to work from reference coordinates and account for differences afterwards. In other words, if I slice I:1-100 and this region contains an insertion at position 50, it should return the reference from 1-100 and THEN add the insertion. The resulting string will be longer than 100 bp. For a deletion, the string would be shorter than 100 bp. Are there any reasons why this might be an issue?
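To make the slicing behavior above concrete, here is a minimal sketch in plain Python. It is not pyfaidx code: `apply_variants` and its arguments are hypothetical names, and it assumes the variants are given as VCF-style `(pos, ref, alt)` tuples with 1-based positions that do not overlap each other.

```python
def apply_variants(ref_seq, start, variants):
    """Apply VCF-style variants to a reference slice.

    ref_seq  -- reference bases for the slice (hypothetical input)
    start    -- 1-based reference coordinate of ref_seq[0]
    variants -- iterable of (pos, ref, alt), 1-based pos, non-overlapping
    """
    end = start + len(ref_seq) - 1
    pieces = []
    cursor = start  # next reference position that has not been emitted yet
    for pos, ref, alt in sorted(variants):
        # Skip variants that overlap an earlier one or extend past the slice.
        if pos < cursor or pos + len(ref) - 1 > end:
            continue
        offset = pos - start
        if ref_seq[offset:offset + len(ref)].upper() != ref.upper():
            raise ValueError("REF allele does not match the reference at %d" % pos)
        pieces.append(ref_seq[cursor - start:offset])  # unchanged reference bases
        pieces.append(alt)                             # substituted allele
        cursor = pos + len(ref)
    pieces.append(ref_seq[cursor - start:])            # remainder of the slice
    return "".join(pieces)


reference = "ACGTACGTAC"  # stands in for a 10 bp slice starting at position 1
with_insertion = apply_variants(reference, 1, [(5, "A", "ATT")])
with_deletion = apply_variants(reference, 1, [(2, "CGT", "C")])
```

All lookups use reference coordinates, so the caller still asks for the reference interval, and only the returned string changes length: longer with the insertion, shorter with the deletion.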

Thanks for your continual support of pyfaidx - it has been very useful.

danielecook avatar Dec 31 '15 20:12 danielecook

It would be nice to have indels implemented the same way "bcftools consensus" works. For now, I run bcftools as a subprocess to incorporate all VCF records into FASTA regions.
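A subprocess workaround of this kind might look roughly like the sketch below. It assumes `samtools` and `bcftools` are on the PATH, that the VCF is bgzip-compressed and indexed, and it follows the `samtools faidx ref.fa REGION | bcftools consensus in.vcf.gz` pattern from the bcftools documentation. The function names and file names are hypothetical.

```python
import subprocess


def consensus_commands(fasta, vcf_gz, region):
    """Build the two commands; kept separate so they can be inspected."""
    faidx_cmd = ["samtools", "faidx", fasta, region]
    consensus_cmd = ["bcftools", "consensus", vcf_gz]
    return faidx_cmd, consensus_cmd


def region_consensus(fasta, vcf_gz, region):
    """Return FASTA text for `region` with the VCF records applied."""
    faidx_cmd, consensus_cmd = consensus_commands(fasta, vcf_gz, region)
    # Extract the reference region as FASTA.
    faidx = subprocess.run(faidx_cmd, check=True, capture_output=True)
    # Feed that FASTA to bcftools consensus on stdin.
    result = subprocess.run(
        consensus_cmd, input=faidx.stdout, check=True, capture_output=True
    )
    return result.stdout.decode()


# Hypothetical file names, shown only to illustrate the command layout.
commands = consensus_commands("ref.fa", "calls.vcf.gz", "I:1-100")
```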

Thank you in advance

Benja1972 avatar Feb 20 '19 10:02 Benja1972

Thanks for the feedback. I agree that the bcftools model is appropriate, and if I can get some time, or someone willing to help with the implementation, it will get done :).

mdshw5 avatar Feb 20 '19 16:02 mdshw5