methphaser

Duplicated and missing entries in .methphased.vcf

Open liuy0421 opened this issue 1 year ago • 6 comments

When I look at the entries in the output .methphased.vcf, there seem to be completely duplicated rows (not just one variant that is methphased into different haplotypes and therefore split into multiple entries with different PS tags, but entirely duplicate entries). This is not a big issue because complete duplicates can easily be dropped, but what could be causing this? Is this intentional?

There also seem to be variants in the original .vcf file that are missing from the output .methphased.vcf, including some that were phased in the original .vcf. Is that intentional?

The original .vcf was filtered to contain only variants on autosomes, and the input .bam files were filtered to contain only primary alignments but were not filtered to keep only reads that map to autosomes.
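
For anyone trying to reproduce this, here is a minimal sketch of how both symptoms could be counted. It is not part of methphaser: the function names are hypothetical, it assumes uncompressed tab-separated VCF text, and it identifies a variant by CHROM, POS, REF, and ALT only, which may be too coarse for some files.

```python
from collections import Counter


def read_records(lines):
    """Return the data lines of a VCF, skipping headers and blank lines."""
    return [ln.rstrip("\n") for ln in lines
            if ln.strip() and not ln.startswith("#")]


def variant_key(record):
    """Assumed identity of a variant: CHROM, POS, REF, ALT (columns 1, 2, 4, 5)."""
    fields = record.split("\t")
    return (fields[0], fields[1], fields[3], fields[4])


def exact_duplicates(records):
    """Map each fully identical row that occurs more than once to its count."""
    counts = Counter(records)
    return {rec: n for rec, n in counts.items() if n > 1}


def missing_variants(original, output):
    """Rows of the original VCF whose variant key never appears in the output."""
    output_keys = {variant_key(rec) for rec in output}
    return [rec for rec in original if variant_key(rec) not in output_keys]
```

Each function takes any iterable of lines, so an open file handle works, for example `read_records(open(path))` with `path` pointing at the original or the methphased VCF. Because `exact_duplicates` compares whole rows, a variant that was split into several entries with different PS tags is not reported as a duplicate.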

Thank you so much!

liuy0421 · Jun 29 '23 19:06