gvcfgenotyper
providing a list of variants to genotype
Is it possible to give gvcfgenotyper a list of chromosome positions at which to genotype GVCFs? For example, I have two sets of samples, each joint genotyped using gvcfgenotyper. Some variants are only present in set 1, and I would like to know whether those sites could be genotyped in all of the samples in set 2. I am trying to avoid having to re-run gvcfgenotyper on all of the samples combined. Perhaps it is not possible, but I wanted to ask.
Hi, you can give gvcfgenotyper a list of regions using the -r command-line argument. The syntax is identical to "bcftools view": https://samtools.github.io/bcftools/bcftools.html#common_options. We don't support the -R option yet, which reads the regions from a file, but it wouldn't be hard to add if it helps.
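As a rough sketch of the -r usage described above: the -f, -l, -Ob and -o options are the ones that appear in a command later in this thread, while the file names and the region are placeholders chosen for illustration. The command is only printed unless DRY_RUN is set to 0.

```shell
#!/bin/sh
# Placeholder inputs (assumptions, not taken from the thread).
FASTA="reference.fa"
GVCF_LIST="gvcf_paths.txt"
REGION="chr1:1000000-2000000"   # bcftools-style chrom:start-end

# Build the command once so it can be inspected before running.
CMD="gvcfgenotyper -f $FASTA -l $GVCF_LIST -r $REGION -Ob -o region.bcf"

# Print by default; set DRY_RUN=0 to actually execute the command.
if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$CMD"
else
    $CMD
fi
```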
Thanks for getting back to me
We did try the -r command-line argument, which works great when we are multi-sample calling variants on a single chromosome. However, when we provide -r chr:position we get a blank VCF back (header only). The position we are querying is not variant in the set of samples being tested. We know that, but we were hoping to get back reference or missing genotypes for the samples whose GVCFs we are providing. If it would be possible to support the -R option that would be great, so long as what we are trying to do is actually possible with gvcfgenotyper.
Thanks again for the help!
Claire
Hi Claire,
I realized that I did not really understand your problem. Thanks for the clarification.
This is not directly supported by gvcfgenotyper but you could try to "hack" it.
If you are familiar with bcftools, you could use "bcftools view" to slice out the variants that you are interested in and save them in a vcf.gz or bcf file. So something like:
bcftools view -r region_around_variants -Ob -o variants.bcf
You could then add this variants.bcf file to the list of input files for gvcfgenotyper and run gvcfgenotyper only on these regions (this is important).
I have not tested this myself but I think this should work.
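The suggested hack could be sketched roughly as follows. This is untested, like the suggestion itself. The input file names, the region, and the output names are placeholders; the bcftools index step and the separate input list are assumptions about what gvcfgenotyper needs, not something confirmed in this thread. Commands are only printed unless DRY_RUN is set to 0.

```shell
#!/bin/sh
# Placeholder names (assumptions): the joint-genotyped output of set 1,
# the GVCF list for set 2, the reference, and a region around the variants.
SET1_BCF="set1_genotyped.bcf"
GVCF_LIST="set2_gvcf_paths.txt"
FASTA="reference.fa"
REGION="chr1:1000000-1001000"

# Print each command; execute it as well only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" != "1" ]; then "$@"; fi
}

# 1. Slice the variants of interest out of set 1 (-r needs an indexed input).
run bcftools view -r "$REGION" -Ob -o variants.bcf "$SET1_BCF"
# 2. Index the slice (assumed to be required for region access).
run bcftools index variants.bcf
# 3. Write a new input list: the set 2 GVCFs plus the slice.
if [ "${DRY_RUN:-1}" != "1" ]; then
    { cat "$GVCF_LIST"; echo variants.bcf; } > with_variants.txt
fi
# 4. Genotype only the same region, as the reply stresses.
run gvcfgenotyper -f "$FASTA" -l with_variants.txt -r "$REGION" -Ob -o set2_at_set1_sites.bcf
```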
Adding an option for gvcfgenotyper to force-genotype a set of variants is an interesting idea. It is not straightforward to add. I cannot make any promises but I could add it to the backlog and see if we find time to do it.
Thanks for the speedy reply and the suggested hack. We will give that a go. If you could also add the force-genotype option to your list of possible future jobs I would be grateful.
Many thanks
Claire
Hello,
I tried this hack as suggested but I get an error explaining that the GVCFs are interrupted (not contiguous) and therefore, it terminates.
"You could then add this variants.bcf files to the list of input files for gvcfgenotyper and run gvcfgt only on these regions (this is important)."
I am assuming I may be missing something in the code to ensure that it is only this region?
gvcfgenotyper -f path/to/fasta -l listofGVCFpaths.txt -Ob -o genotyped_GVCFs.bcf
What if I have variants from multiple regions?
Hi,
You can restrict gvcfgenotyper to a region using the -r command-line argument. The syntax is the same as in bcftools (chrom:start-end).
If you have variants from multiple regions, I recommend cutting out several slices around these variants using the bcftools view command mentioned above and then running several gvcfgenotyper jobs, one for each slice.
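One way to script the one-job-per-slice recommendation is a simple loop. This is only a sketch: the regions, file names, and numbered outputs are placeholders, and the index step and per-slice input lists are assumptions rather than documented gvcfgenotyper requirements. Commands are only printed unless DRY_RUN is set to 0.

```shell
#!/bin/sh
# Placeholder inputs (assumptions, not taken from the thread).
SET1_BCF="set1_genotyped.bcf"
GVCF_LIST="set2_gvcf_paths.txt"
FASTA="reference.fa"
REGIONS="chr1:1000000-1001000 chr2:500000-501000"

# Print each command; execute it as well only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" != "1" ]; then "$@"; fi
}

n=0
for region in $REGIONS; do
    n=$((n + 1))
    slice="slice_$n.bcf"
    list="inputs_$n.txt"
    # Cut one slice around the variants in this region and index it.
    run bcftools view -r "$region" -Ob -o "$slice" "$SET1_BCF"
    run bcftools index "$slice"
    # Per-job input list: the set 2 GVCFs plus this slice.
    if [ "${DRY_RUN:-1}" != "1" ]; then
        { cat "$GVCF_LIST"; echo "$slice"; } > "$list"
    fi
    # One gvcfgenotyper job per slice, restricted to the same region.
    run gvcfgenotyper -f "$FASTA" -l "$list" -r "$region" -Ob -o "genotyped_$n.bcf"
done
```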
I also need this functionality - both the force output of reference genotypes and the list of input variants. The hack hasn't worked for me either.