SURVIVOR

SV coordinates merged VCF from Manta and GRIDSS

Open tgong1 opened this issue 2 years ago • 3 comments

Hi,

I was merging VCFs from Manta and GRIDSS using SURVIVOR, but I cannot understand many of the resulting SV coordinates. Please see the following example.

Command:

SURVIVOR merge sample_files 200 2 0 0 0 50 sample_merged.vcf

This is one SV in the merged VCF:

grep "gridss0f_11555b" sample_merged.vcf

chr1 1924217 gridss0f_11555b C CAGCTCACAGCCCACCCCCCCATCTCACCGCCCAGCCCCCCCATCTCACCAGCTGCCCCCTCCCCGACACACCGCCCACCCCCCCATCTCACCAACCACCCCCCTCCAGCTCACCACCC. 1869 PASS SUPP=2;SUPP_VEC=11;SVLEN=118;SVTYPE=INV;SVMETHOD=SURVIVOR1.0.7;CHR2=chr1;END=1924316;CIPOS=0,16;CIEND=0,35;STRANDS=++ GT:PSV:LN:DR:ST:QV:TY:ID:RAL:AAL:CO ./.:NA:118:0,0:++:1869,685:INV,INV:gridss0f_11555b:C:CAGCTCACAGCCCACCCCCCCATCTCACCGCCCAGCCCCCCCATCTCACCAGCTGCCCCCTCCCCGACACACCGCCCACCCCCCCATCTCACCAACCACCCCCCTCCAGCTCACCACCC.:chr1_1924217-chr1_1924316,chr1_1924233-chr1_1924351 1/1:NA:103:0,0:+-:312:INS:MantaINS_2_188_188_0_0_0:G:GACCACCCCCCAGCTCACAGCCCACCCCCCCATCTCACCGCCCAGCCCCCCCATCTCACCAGCTGCCCCCTCCCGGGCACACCGCCCACCCCCCCATCTCACCA:chr1_1924223-chr1_1924223
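For reference, the positional arguments of the merge command above can be labelled as in the sketch below. The labels are my reading of the usage text that `SURVIVOR merge` prints, not something stated in this thread, so please check them against your installed version. The helper name `label_merge_args` is hypothetical.

```python
# Labels follow the SURVIVOR merge usage text as I understand it (assumption,
# not confirmed in this thread). label_merge_args is a hypothetical helper.
MERGE_ARG_LABELS = [
    "file listing input VCF paths",
    "max distance between breakpoints",
    "minimum number of supporting callers",
    "take SV type into account (1 = yes)",
    "take strands into account (1 = yes)",
    "disabled / estimate distance from SV size",
    "minimum SV size",
    "output VCF",
]


def label_merge_args(command):
    """Pair each positional argument of a 'SURVIVOR merge' command with a label."""
    tokens = command.split()
    if tokens[:2] != ["SURVIVOR", "merge"]:
        raise ValueError("expected a 'SURVIVOR merge ...' command")
    args = tokens[2:]
    if len(args) != len(MERGE_ARG_LABELS):
        raise ValueError(
            "expected %d arguments, got %d" % (len(MERGE_ARG_LABELS), len(args))
        )
    return dict(zip(MERGE_ARG_LABELS, args))


labelled = label_merge_args(
    "SURVIVOR merge sample_files 200 2 0 0 0 50 sample_merged.vcf"
)
for label, value in labelled.items():
    print(f"{value:>20}  {label}")
```

If these labels are right, the fourth argument being 0 means calls are not required to agree on SV type before they are merged, which would allow an INS call and a BND call to end up in the same merged record.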

Looking at the GRIDSS and Manta VCFs individually, I can't work out where the coordinates chr1_1924217-chr1_1924316 in the merged VCF come from.
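One way to inspect this is to split the per-sample columns of the merged record and read the TY, ID and CO fields, which hold what each input file contributed. The sketch below is a minimal illustration, not part of SURVIVOR. The helper names `parse_co` and `per_sample_calls` are hypothetical, and the example record is a shortened copy of the merged record quoted above, with the long allele strings replaced by `N` and some INFO entries dropped.

```python
def parse_co(co):
    """Split a CO value such as 'chr1_100-chr1_200,chr1_110-chr1_250'
    into (chrom, start, chrom2, end) tuples.
    Assumes contig names contain no '-' characters."""
    intervals = []
    for pair in co.split(","):
        left, right = pair.split("-")
        chrom1, start = left.rsplit("_", 1)
        chrom2, end = right.rsplit("_", 1)
        intervals.append((chrom1, int(start), chrom2, int(end)))
    return intervals


def per_sample_calls(vcf_line):
    """Return ID, types and coordinates for each sample column of a merged record."""
    cols = vcf_line.split()          # works for tab- or space-separated text
    keys = cols[8].split(":")        # FORMAT column
    calls = []
    for sample in cols[9:]:
        fields = dict(zip(keys, sample.split(":")))
        calls.append({
            "id": fields["ID"],
            "types": fields["TY"].split(","),
            "intervals": parse_co(fields["CO"]),
        })
    return calls


# Shortened copy of the merged record (allele strings replaced by N).
record = (
    "chr1 1924217 gridss0f_11555b C N 1869 PASS "
    "SUPP=2;SUPP_VEC=11;SVLEN=118;SVTYPE=INV;END=1924316 "
    "GT:PSV:LN:DR:ST:QV:TY:ID:RAL:AAL:CO "
    "./.:NA:118:0,0:++:1869,685:INV,INV:gridss0f_11555b:C:N:"
    "chr1_1924217-chr1_1924316,chr1_1924233-chr1_1924351 "
    "1/1:NA:103:0,0:+-:312:INS:MantaINS_2_188_188_0_0_0:G:N:"
    "chr1_1924223-chr1_1924223"
)

calls = per_sample_calls(record)
for call in calls:
    print(call["id"], call["types"], call["intervals"])
```

In this record the first sample column lists two coordinate pairs and two QV values (1869,685). The second pair starts at 1924233, which is the POS of gridss0f_11555b in the GRIDSS VCF, and 685 is close to that record's QUAL of 685.73. This suggests, though nothing in this thread confirms it, that the first pair (1924217 to 1924316) may come from a different record in the GRIDSS input that was folded into the same merged call.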

grep "gridss0f_11555b" gridss.B.sv.vcf

chr1 1924233 gridss0f_11555b C CAGCTCACAGCCCACCCCCCCATCTCACCGCCCAGCCCCCCCATCTCACCAGCTGCCCCCTCCCCGACACACCGCCCACCCCCCCATCTCACCAACCACCCCCCTCCAGCTCACCACCC. 685.73 LOW_QUAL AS=0;ASC=1X;ASQ=0;ASRP=0;ASSR=0;BA=1;BANRP=0;BANRPQ=0;BANSR=0;BANSRQ=0;BAQ=342.87;BASRP=2;BASSR=12;BEALN=chr1:1924132|+|33M2I41M1D42M|10;BEID=asm0-24467;BEIDH=-1;BEIDL=3;BQ=685.73;BSC=12;BSCQ=287.53;BUM=2;BUMQ=55.33;BVF=14;CAS=0;CASQ=0;CQ=934.08;EVENT=gridss0f_11555;IC=0;IQ=0;RAS=0;RASQ=0;REF=27;REFPAIR=22;RP=0;RPQ=0;SB=0;SC=1X;SR=0;SRQ=0;SVTYPE=BND;VF=0 GT:ASQ:ASRP:ASSR:BANRP:BANRPQ:BANSR:BANSRQ:BAQ:BASRP:BASSR:BQ:BSC:BSCQ:BUM:BUMQ:BVF:CASQ:IC:IQ:QUAL:RASQ:REF:REFPAIR:RP:RPQ:SR:SRQ:VF .:0:0:0:0:0:0:0:96.9:1:3:193.79:3:68.57:1:28.32:4:0:0:0:0:0:11:10:0:0:0:0:0 .:0:0:0:0:0:0:0:245.97:1:9:491.94:9:218.96:1:27.01:10:0:0:0:0:0:16:12:0:0:0:0:0

grep "MantaINS:2:188:188:0:0:0" manta.B.PASS.recode.vcf

chr1 1924223 MantaINS:2:188:188:0:0:0 G GACCACCCCCCAGCTCACAGCCCACCCCCCCATCTCACCGCCCAGCCCCCCCATCTCACCAGCTGCCCCCTCCCGGGCACACCGCCCACCCCCCCATCTCACCA 312 PASS END=1924223;SVTYPE=INS;SVLEN=103;CIGAR=1M103I;CIPOS=0,10;HOMLEN=10;HOMSEQ=ACCACCCCCC GT:FT:GQ:PL:PR:SR 1/1:PASS:23:365,26,0:0,0:0,11
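As a quick sanity check, the start positions of the three records quoted above can be compared against the value 200 from the merge command. The sketch below is plain arithmetic on the POS columns and is not SURVIVOR's actual merging logic. Treating 200 as a maximum breakpoint distance in bp is my assumption about that argument.

```python
from itertools import combinations

# POS values copied from the three records quoted above.
positions = {
    "merged (SURVIVOR)": 1924217,
    "gridss0f_11555b (GRIDSS)": 1924233,
    "MantaINS:2:188:188:0:0:0 (Manta)": 1924223,
}

# Value taken from the merge command; its meaning as a bp distance is assumed.
MAX_DIST = 200

# Absolute distance between the start positions of every pair of records.
distances = {
    (a, b): abs(positions[a] - positions[b])
    for a, b in combinations(positions, 2)
}

for (a, b), dist in distances.items():
    status = "within" if dist <= MAX_DIST else "beyond"
    print(f"{a} vs {b}: {dist} bp ({status} {MAX_DIST} bp)")
```

All three start positions lie within 16 bp of each other, so under this simple reading the Manta and GRIDSS calls are close enough to be grouped. The check says nothing about why the merged record starts at 1924217 rather than at either caller's POS.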

Thank you for your time and help! Tingting

tgong1, Jun 27 '22 08:06

Dear @tgong1

I'm not one of the developers. I just wanted to ask whether you have figured out the meaning of those coordinates.

Have you also noticed the SV type that SURVIVOR reported for this SV? Manta calls it an insertion and GRIDSS calls it a BND (I don't know whether you used the R script provided on the GRIDSS GitHub page to annotate the SVs), yet SURVIVOR reports it as an inversion (SVTYPE=INV). The same thing happens with many of my SVs. In fact, three quarters of my SVs are reported as INV by SURVIVOR! Do you happen to have any ideas about this?
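To put a number on how often each type is reported, the SVTYPE entries in the INFO column of a merged VCF can be tallied. The sketch below is a minimal illustration. The helper names `svtype_of` and `count_svtypes` are hypothetical, and the demo lines are shortened placeholders modelled on the records in this thread, not real output.

```python
from collections import Counter


def svtype_of(info):
    """Return the SVTYPE value from a VCF INFO string, or None if absent."""
    for entry in info.split(";"):
        key, _, value = entry.partition("=")
        if key == "SVTYPE":
            return value
    return None


def count_svtypes(lines):
    """Count SVTYPE values over VCF data lines, skipping headers and blanks."""
    counts = Counter()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.split()
        counts[svtype_of(cols[7]) or "UNKNOWN"] += 1
    return counts


# Shortened placeholder lines (not real output).
demo_lines = [
    "##fileformat=VCFv4.1",
    "chr1 1924217 id1 C N 1869 PASS SUPP=2;SVTYPE=INV;END=1924316",
    "chr1 1924223 id2 G N 312 PASS END=1924223;SVTYPE=INS;SVLEN=103",
]
demo_counts = count_svtypes(demo_lines)
print(dict(demo_counts))

# With a real file, for example:
# with open("sample_merged.vcf") as handle:
#     print(count_svtypes(handle).most_common())
```

Running the same count on each caller's input VCF and on the merged VCF makes it easier to see whether the INV calls are present in the inputs or only appear after merging.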

Sorry for taking up your time and thanks in advance, Best

dr-ashu-geno, Jan 30 '23 02:01

Hi @dr-ashu-geno,

I didn't get a reply from the author, so I still have no idea why SURVIVOR reported it as INV. To me, the SV in the example I showed is an insertion, as both Manta and GRIDSS recovered and reported the inserted sequence. I have used my own method to classify SV types and merge SV calls (https://github.com/tgong1/StructuralVariantUtil). Sorry, I haven't had time to write detailed guidance for it yet.

Regards, Tingting

tgong1, Jan 30 '23 03:01

Dear Tingting

Thank you so much for the quick reply. I will take a look at your tool to see if I can use it on my own data.

Thanks, Best regards,

dr-ashu-geno, Jan 30 '23 04:01