bcftools
bcftools +mendelian plugin query (help needed)
Hi,
I am using bcftools v1.14. I am working with trios and using the bcftools +mendelian plugin to retain consistent records, i.e. sites (VCF records) where the trio has no missing genotypes and no Mendelian error. I am using this command to achieve this:
bcftools +mendelian -m + --trio Mother,Father,Proband trio_precise_GQ20_snpEff.vcf -O v -o trio_precise_GQ20_snpEff_consistant.vcf
The issue I am facing is that records with missing values remain in the resulting VCF file. If I just print the consistency counts report, the number of output VCF sites is the same as nOK in the counts report. Why do missing values still remain in the consistent records output file? Is this the correct behavior, or am I misunderstanding it?
Regards, Prasun
It appears there is no mode to list consistent sites with non-missing genotypes. This could be added. In the meantime, you could remove sites with missing genotypes by piping the output through bcftools view -e 'GT="mis"'
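For reference, the two steps can be chained into a single pipeline. This is only a sketch: the function name consistent_nonmissing and the TRIO variable are made up for illustration, the sample names are the ones from the command above, and it assumes the plugin writes to standard output when -o is not given.

```shell
# Hypothetical helper: keep Mendelian-consistent sites, then drop records
# that contain a missing genotype.
# Assumption: +mendelian writes to stdout when no -o is given.
TRIO="${TRIO:-Mother,Father,Proband}"   # mother,father,child sample names

consistent_nonmissing() {
  in_vcf="$1"
  out_vcf="$2"
  # Step 1: list consistent sites (-m +), uncompressed BCF for the pipe.
  # Step 2: exclude records matching GT="mis".
  bcftools +mendelian -m + --trio "$TRIO" "$in_vcf" -O u |
    bcftools view -e 'GT="mis"' -O v -o "$out_vcf"
}
```

Usage would look like: consistent_nonmissing trio_precise_GQ20_snpEff.vcf trio_precise_GQ20_snpEff_consistant.vcf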
Thanks a lot @pd3! I have done exactly what you suggested downstream, but have used bcftools filter instead (bcftools filter -e 'GT[*] = "./."')
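One way to double-check that no missing genotypes are left, independent of the bcftools expression used, is to count the records by hand. The sketch below is not part of bcftools; count_missing_gt is a made-up name, and it assumes an uncompressed, tab-delimited VCF in which GT is the first FORMAT subfield. Any genotype containing "." (including half-missing calls such as ./1) is counted.

```shell
# Hypothetical helper: count records in which at least one sample has a
# missing or half-missing genotype.
# Assumptions: plain-text VCF, GT is the first FORMAT subfield.
count_missing_gt() {
  awk -F'\t' '
    /^#/ { next }                       # skip header lines
    {
      for (i = 10; i <= NF; i++) {      # sample columns start at column 10
        split($i, f, ":")               # f[1] holds the GT subfield
        if (f[1] ~ /\./) { n++; break } # count each record at most once
      }
    }
    END { print n + 0 }                 # print 0 when nothing matched
  ' "$1"
}
```

Running count_missing_gt on the filtered file should print 0 if every record with a missing genotype was removed.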
Shouldn't the definition of nOK be changed? It is currently defined as "nOK .. number of genotypes at which the trio had no missingness and no Mendelian error" (https://samtools.github.io/bcftools/howtos/plugin.mendelian.html). I ask because the consistent sites do contain missing values, which brings me to the next question: if there were missing values, why did the plugin classify that site as consistent? Shouldn't it be skipped? Please correct me if I have misunderstood something.
P.S. I am working with structural variants trios.
Hopefully the new version of the plugin, +mendelian2, resolves this issue. See also https://github.com/samtools/bcftools/issues/1738#issuecomment-1298518068