
Implement multiallelic variants/variant normalization

Open holtgrewe opened this issue 1 year ago • 2 comments

Is your feature request related to a problem? Please describe.
Currently, mehari does not support multi-allelic sites. This is a big limitation: such input first has to be split with bcftools norm -m -any --force.

Describe the solution you'd like
Allow mehari to process multi-allelic sites. To make precise predictions, mehari will also need to normalize the sites, and for that it will need to be given the FASTA reference. Splitting and normalization can change record positions, so mehari will need to sort the variants again before writing them out to keep the resulting VCF file sorted.
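As a rough illustration of the split, normalize, and re-sort steps described above, here is a minimal Rust sketch. It is not mehari's implementation: the `Variant` struct and the functions `trim` and `split_and_normalize` are hypothetical names made up for this example. It also only trims shared bases and does not left-align indels against the reference, which is the step that would actually need the FASTA file. Sorting by contig name is a simplification; a real VCF writer would follow the contig order from the header.

```rust
/// Hypothetical minimal record type for illustration; mehari's real types differ.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Variant {
    chrom: String,
    pos: u64, // 1-based position, as in VCF
    reference: String,
    alternative: String,
}

/// Trim shared trailing bases, then shared leading bases, always keeping at
/// least one base per allele. This does NOT left-align indels, which would
/// require looking up the reference sequence (FASTA).
fn trim(mut v: Variant) -> Variant {
    while v.reference.len() > 1
        && v.alternative.len() > 1
        && v.reference.as_bytes().last() == v.alternative.as_bytes().last()
    {
        v.reference.pop();
        v.alternative.pop();
    }
    while v.reference.len() > 1
        && v.alternative.len() > 1
        && v.reference.as_bytes()[0] == v.alternative.as_bytes()[0]
    {
        v.reference.remove(0);
        v.alternative.remove(0);
        v.pos += 1; // dropping a leading base shifts the record to the right
    }
    v
}

/// Split one multi-allelic site into one record per ALT allele, trim each
/// record, and re-sort, because trimming can change positions.
fn split_and_normalize(chrom: &str, pos: u64, reference: &str, alts: &[&str]) -> Vec<Variant> {
    let mut out: Vec<Variant> = alts
        .iter()
        .map(|alt| {
            trim(Variant {
                chrom: chrom.to_string(),
                pos,
                reference: reference.to_string(),
                alternative: alt.to_string(),
            })
        })
        .collect();
    // Simplification: orders by contig name, then position.
    out.sort_by(|a, b| (a.chrom.as_str(), a.pos).cmp(&(b.chrom.as_str(), b.pos)));
    out
}
```

For a site at chr1:100 with REF `CAG` and ALT `CAT,C`, calling `split_and_normalize("chr1", 100, "CAG", &["CAT", "C"])` gives two records: the deletion `CAG>C` stays at position 100, while the substitution is trimmed to `G>T` at position 102. The two records now sit at different positions, which is why the re-sort is needed.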

Describe alternatives you've considered
N/A

Additional context
N/A

holtgrewe avatar May 16 '24 06:05 holtgrewe

But why should normalization functionality be replicated in mehari? It makes much more sense to stick to bcftools norm: it already exists, it is (in some way or another) tried and tested, and there's no need to maintain even more functionality. When encountering multi-allelic sites, simply bail and remind the user to normalize.
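A minimal sketch of what "bail and remind" could look like, assuming the ALT alleles of a record are available as a slice. The function name `ensure_biallelic` and the error text are hypothetical and not part of mehari; only the bcftools command is taken from the issue description.

```rust
/// Hypothetical guard: returns an error for sites with more than one ALT
/// allele instead of trying to annotate them.
fn ensure_biallelic(chrom: &str, pos: u64, alts: &[&str]) -> Result<(), String> {
    if alts.len() > 1 {
        return Err(format!(
            "multi-allelic site at {}:{} ({} ALT alleles); please split it first, e.g. with `bcftools norm -m -any --force`",
            chrom,
            pos,
            alts.len()
        ));
    }
    Ok(())
}
```

A caller would run this check on every record before annotation and stop with the returned message on the first failure.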

tedil avatar May 21 '24 14:05 tedil

VEP can handle it so should we...

holtgrewe avatar May 21 '24 16:05 holtgrewe