SURVIVOR
SURVIVOR filtering variants during merge as being supported by zero callers
Hi there,
I'm trying to use SURVIVOR to merge matching tumour and normal VCFs generated by Sniffles from PromethION data. However, the merge seems to be erroneously losing variants. I've isolated one particular variant, which has extensive support in the tumour (9 reads), and which we know from previous work is a real variant in this cell line.
Minimal tumour.vcf (minus most of the header):
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 20190816_COLO829.fastq.mm2.sorted.bam
1 207981231 1293 N <DEL> . PASS PRECISE;SVMETHOD=Snifflesv1.0.11;CHR2=1;END=208014820;STD_quant_start=2.420153;STD_quant_stop=3.000000;Kurtosis_quant_start=-1.580012;Kurtosis_quant_stop=-0.370370;SVTYPE=DEL;RNAMES=3463dfad-992e-4068-891c-22215f043d06,5cbaae2b-62dc-45ab-a993-daaacf350847,a30fb0cc-bca6-4239-a926-9d3c954e4cc2,afdfd7be-0651-4099-880c-4385ceacd6af,cffe5e9c-5783-427c-a3ad-32728023be77,ecc28152-3598-4f18-8085-3bd146b919d7,f4c7b59b-2898-401b-8b5e-4a4924cb7bcd;SUPTYPE=SR;SVLEN=-33589;STRANDS=+-;RE=7;REF_strand=9,8;AF=0.291667 GT:DR:DV 0/0:17:7
Minimal normal.vcf:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 20190816_COLO829_BL.fastq.mm2.sorted.bam
12 95244024 13927_0 GATCTTATAACTAGAAAAACCTAAAGACTCCACCAAAAAACTCTTAGATCTGATAAATAAATTCAGTAAAGCTTCAGGTACAAAATCAACACACAAAAATCGGTAGCATTTCTATACACCAATAATGAACTTGCTGAGAAAGAAATCAAGAAGGCAATCCCATTTACAATAGCTATAAAAAATAGAATATCTAACAATAAATTTAACCAAGGAGGTTGTCTTAGTCCATTTGTGTAGCTACATCTGAGGCTGGGTAATCTATAAAGAAAAGAGGTTTATTTGGCTAATGGTTCTACAGGCTGTACAAGAAGCACAGCACCAATATCTGCTACTGGAGAGGGCTTCCCGGCTGCTTCTACTCATGGCAGAAGGAGAACGGGAGCTGTTGTATGCAGAGATCATATGGTGAGAGAGAGGAAGCAA N . PASS PRECISE;SVMETHOD=Snifflesv1.0.11;CHR2=12;END=95244447;STD_quant_start=0.447214;STD_quant_stop=0.000000;Kurtosis_quant_start=5.871518;Kurtosis_quant_stop=6.574891;SVTYPE=DEL;RNAMES=1b631bb1-ac12-40c8-b950-34368382ed47,1c541b4f-cd09-4d46-b226-5701711f0bbc,20bca30e-af02-40cc-a723-3e0bff5bc395,21faee21-34b9-4d58-b23c-924199c39a61,24ec0c78-fd77-4933-881a-4f1bc0351678,274a6148-f834-4ee9-b763-8171535a07ee,2ea2a668-4c94-4e95-aa0e-065e2b0f9076,41038240-1471-4c93-8a0b-c9569d185e30,69c904ed-e0cc-4013-8c95-bffe93ae1d31,6c1c8cc4-83d2-4a60-98f6-4a690b77dfc2,7f6d4019-3415-4ded-8089-d792ebc79ed5,a4a2b758-3b46-4a23-8719-40650570aedd,df9b5bc1-d5fb-4d76-bcb6-22bb3053a7ca,f48b0ecc-6a5f-4bb8-8fee-36cbc2eab577;SUPTYPE=AL;SVLEN=-423;STRANDS=+-;RE=14;REF_strand=2,3;AF=0.736842 GT:DR:DV 0/1:5:14
merged vcf (survivor_test.vcf):
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 20190816_COLO829_BL.fastq.mm2.sorted.bam 20190816_COLO829.fastq.mm2.sorted.bam
1 207981231 1293 N <DEL> . PASS SUPP=0;SUPP_VEC=00;SVLEN=-33589;SVTYPE=DEL;SVMETHOD=SURVIVOR1.0.6;CHR2=1;END=208014820;CIPOS=0,0;CIEND=0,0;STRANDS=+- GT:PSV:LN:DR:ST:QV:TY:ID:RAL:AAL:CO ./.:NaN:0:0,0:--:NaN:NaN:NaN:NAN:NAN:NAN 0/0:NA:33589:17,7:+-:.:DEL:1293:NA:NA:1_207981231-1_208014820
12 95244024 13927_0 GATCTTATAACTAGAAAAACCTAAAGACTCCACCAAAAAACTCTTAGATCTGATAAATAAATTCAGTAAAGCTTCAGGTACAAAATCAACACACAAAAATCGGTAGCATTTCTATACACCAATAATGAACTTGCTGAGAAAGAAATCAAGAAGGCAATCCCATTTACAATAGCTATAAAAAATAGAATATCTAACAATAAATTTAACCAAGGAGGTTGTCTTAGTCCATTTGTGTAGCTACATCTGAGGCTGGGTAATCTATAAAGAAAAGAGGTTTATTTGGCTAATGGTTCTACAGGCTGTACAAGAAGCACAGCACCAATATCTGCTACTGGAGAGGGCTTCCCGGCTGCTTCTACTCATGGCAGAAGGAGAACGGGAGCTGTTGTATGCAGAGATCATATGGTGAGAGAGAGGAAGCAA N . PASS SUPP=1;SUPP_VEC=10;SVLEN=-423;SVTYPE=DEL;SVMETHOD=SURVIVOR1.0.6;CHR2=12;END=95244447;CIPOS=0,0;CIEND=0,0;STRANDS=+- GT:PSV:LN:DR:ST:QV:TY:ID:RAL:AAL:CO 0/1:NA:423:5,14:+-:.:DEL:13927_0:GATCTTATAACTAGAAAAACCTAAAGACTCCACCAAAAAACTCTTAGATCTGATAAATAAATTCAGTAAAGCTTCAGGTACAAAATCAACACACAAAAATCGGTAGCATTTCTATACACCAATAATGAACTTGCTGAGAAAGAAATCAAGAAGGCAATCCCATTTACAATAGCTATAAAAAATAGAATATCTAACAATAAATTTAACCAAGGAGGTTGTCTTAGTCCATTTGTGTAGCTACATCTGAGGCTGGGTAATCTATAAAGAAAAGAGGTTTATTTGGCTAATGGTTCTACAGGCTGTACAAGAAGCACAGCACCAATATCTGCTACTGGAGAGGGCTTCCCGGCTGCTTCTACTCATGGCAGAAGGAGAACGGGAGCTGTTGTATGCAGAGATCATATGGTGAGAGAGAGGAAGCAA:N:12_95244024-12_95244447 ./.:NaN:0:0,0:--:NaN:NaN:NaN:NAN:NAN:NAN
SURVIVOR command:
SURVIVOR merge survivor_mini_test.txt 1000 0 0 0 0 50 survivor_test.vcf
Note how the tumour variant has SUPP_VEC=00. I had to set the num_callers parameter to 0 to get it to be included.
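For anyone reading along who doesn't have the positional arguments memorised, here is a small Python sketch that labels each value in the command above. The labels are my own names, based on my reading of the SURVIVOR merge usage text; they are not official option names, so treat them as assumptions and check them against the help output of your installed version.

```python
# Labels are this sketch's own descriptive names (assumptions), not SURVIVOR flags.
# The values are exactly those from the command in this post.
MERGE_ARGS = [
    ("vcf_list_file", "survivor_mini_test.txt"),
    ("max_breakpoint_distance", 1000),
    ("min_supporting_callers", 0),      # the num_callers parameter mentioned above
    ("take_type_into_account", 0),
    ("take_strand_into_account", 0),
    ("estimate_distance_from_size", 0),
    ("min_sv_size", 50),
    ("output_vcf", "survivor_test.vcf"),
]


def build_merge_command(args=MERGE_ARGS):
    """Return the argument list for the merge call, in positional order."""
    return ["SURVIVOR", "merge"] + [str(value) for _, value in args]


if __name__ == "__main__":
    # Print the command as it would be typed in a shell.
    print(" ".join(build_merge_command()))
```

The list could be passed to `subprocess.run` as-is; it is only printed here so the sketch runs without SURVIVOR installed.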
And, now that I think about it, I see the sentence "NOTE ./. or 0/0 is not counted as supporting a variant." in the docs. Since this is a subclonal heterozygous tumour variant, its allele frequency is below 0.5 (AF=0.291667 in the record above), and Sniffles called the genotype as 0/0, so SURVIVOR does not count the tumour call as supporting the variant.
It might be helpful to have something in the documentation noting that num_callers can be set to zero to include variants like these.
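In case it helps anyone hitting the same thing, here is a rough Python sketch for spotting calls like this one before merging: records where the genotype is 0/0 but the sample column still reports variant-supporting reads. It assumes a tab-separated, single-sample VCF with the GT:DR:DV format shown in the records above; the function name and the `min_variant_reads` threshold are hypothetical, and the example lines use a shortened INFO column.

```python
def zero_genotype_with_support(vcf_line, min_variant_reads=1):
    """Return True for a 0/0 call that still has variant-supporting reads (DV)."""
    if vcf_line.startswith("#"):
        return False  # header line
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 10:
        return False  # no FORMAT/sample columns
    # Pair FORMAT keys (e.g. GT:DR:DV) with the sample's values.
    sample = dict(zip(fields[8].split(":"), fields[9].split(":")))
    try:
        variant_reads = int(sample.get("DV", "0"))
    except ValueError:
        return False  # DV missing or not an integer
    return sample.get("GT") == "0/0" and variant_reads >= min_variant_reads


if __name__ == "__main__":
    # Abbreviated versions of the tumour and normal records from this post.
    tumour = "\t".join(["1", "207981231", "1293", "N", "<DEL>", ".", "PASS",
                        "SVTYPE=DEL;RE=7;AF=0.291667", "GT:DR:DV", "0/0:17:7"])
    normal = "\t".join(["12", "95244024", "13927_0", "G", "N", ".", "PASS",
                        "SVTYPE=DEL;RE=14;AF=0.736842", "GT:DR:DV", "0/1:5:14"])
    print(zero_genotype_with_support(tumour))  # True: 0/0 with 7 variant reads
    print(zero_genotype_with_support(normal))  # False: genotype is 0/1
```

Records flagged this way are the ones that would end up with a zero in SUPP_VEC for that sample, so they are worth checking by hand if num_callers is left above 0.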
Thanks for reaching out. I am honestly not sure what the smartest behaviour would be here. For de novo calls like yours, a 0/0 genotype could be taken into account. For force calling (genotyping of known SVs), a 0/0 should not be taken into account.
I will try to highlight this better in the documentation. Thanks, Fritz
Thanks for the response! (And for providing the tool in the first place).