snippy
generate a variant table for dnds calculation
Hello,
I was wondering if there is an easy way to generate a genotype-by-isolate table for subsequent dN/dS calculation. I previously used the core.tab file from snippy-core, but I think that omits a variant if it is not found in all genomes.
Many thanks for your guidance,
Hi @smb20200615,
you can use bcftools merge to merge the VCF files of your isolates (bcftools merge expects bgzip-compressed, indexed input files):
bcftools merge -m none -0 -O z <iso1/snps.vcf> <iso2/snps.vcf> ... > merged.vcf.gz
You can then either use the merged VCF file as input for your tool or transform it into tabular format using packages such as vcfR (in R) or scikit-allel (in Python).
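For readers who want to see what the tabular step involves, here is a minimal standard-library sketch that turns the text of a merged multi-sample VCF into a genotype-by-isolate table. The function name vcf_to_table and the demo records are hypothetical, and it assumes each sample column carries a GT field and that only the first allele of each call matters (reasonable for haploid bacterial calls). vcfR and scikit-allel handle far more cases than this does.

```python
def vcf_to_table(lines):
    """Return (sample_names, rows); each row is (chrom, pos, ref, alt, calls).

    Hypothetical helper: calls holds one allele string per isolate,
    with "." for a missing genotype.
    """
    samples, rows = [], []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information headers
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample columns follow the FORMAT column
            continue
        chrom, pos, ref, alt, fmt = fields[0], int(fields[1]), fields[3], fields[4], fields[8]
        alleles = [ref] + alt.split(",")
        gt_index = fmt.split(":").index("GT")
        calls = []
        for sample_field in fields[9:]:
            gt = sample_field.split(":")[gt_index]
            # Use the first allele index only (assumption: haploid organism).
            first = gt.replace("|", "/").split("/")[0]
            calls.append("." if first == "." else alleles[int(first)])
        rows.append((chrom, pos, ref, alt, calls))
    return samples, rows


# Made-up records for illustration only.
demo = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tiso1\tiso2",
    "chr\t10\t.\tA\tG\t.\t.\t.\tGT\t1/1\t./.",
    "chr\t25\t.\tC\tT\t.\t.\t.\tGT\t0/0\t1/1",
]
samples, rows = vcf_to_table(demo)
print("CHROM", "POS", "REF", "ALT", *samples, sep="\t")
for chrom, pos, ref, alt, calls in rows:
    print(chrom, pos, ref, alt, *calls, sep="\t")
```

For a real merged.vcf.gz, the lines could come from gzip.open("merged.vcf.gz", "rt") instead of the demo list.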
@stefanogg thank you so much for your guidance. I had tried that, but the issue with merging the variants is that we won't get the null calls - the approach assumes that areas with no variant are the same as the reference, which will not be true if we have an N at the site. Can snippy provide info on these null sites?
No, snippy-core doesn't provide details of the null sites, because it uses a strict definition of core genome (0% gaps). You can use bcftools merge without the option -0 (--missing-to-ref) so that Ns are coded as missing (./.) instead of 0/0.
You can also use goalign or trimal to gather information on N-sites and then filter the VCF files.
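The "gather N-sites, then filter" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each isolate has one sequence aligned to the reference in reference coordinates (Snippy's per-isolate snps.aligned.fa looks like such a file, but check its documentation), the reference is a single contig, and the helper names n_positions and mask_calls are hypothetical. It does not reproduce what goalign or trimal do.

```python
def n_positions(aligned_seq):
    """Return the 1-based positions where the aligned sequence is N or a gap."""
    return {i for i, base in enumerate(aligned_seq, start=1) if base in "Nn-"}


def mask_calls(rows, samples, masked):
    """Set a call to "." wherever that isolate has an N or gap at the site.

    rows:   list of (pos, calls), one call per isolate, in the order of samples
    masked: dict mapping isolate name to its set of N/gap positions
    """
    out = []
    for pos, calls in rows:
        new_calls = [
            "." if pos in masked.get(name, set()) else call
            for name, call in zip(samples, calls)
        ]
        out.append((pos, new_calls))
    return out


# Made-up five-base alignments for illustration only.
aligned = {"iso1": "ACGTN", "iso2": "AC-TA"}
masked = {name: n_positions(seq) for name, seq in aligned.items()}

samples = ["iso1", "iso2"]
rows = [(3, ["G", "G"]), (5, ["A", "A"])]
filtered = mask_calls(rows, samples, masked)
print(filtered)
```

Masking in the table, rather than editing each VCF, keeps the merged file untouched and makes it easy to count how many isolates are missing at each site before running dN/dS.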
How would bcftools know what is missing/ambiguous (N) if Snippy does not record such positions in its VCF file in the first place? Aren't such sites purposefully left out (i.e. --minfrac parameter) of the final set of SNP calls due to their low confidence?