truvari
merging different SV type?
Version : Truvari v4.2.2-dev 1cd03b2f4e8afdb3431595fa351501b36db3cfd8
Describe the bug :
I'm trying to use Truvari to merge a file containing SV calls from 4 different callers. I built that file with SURVIVOR instead of bcftools merge, because the latter has problems combining the different fields coming from the 4 callers. I ran SURVIVOR in a very stringent mode to avoid over-merging and leave that part to Truvari. I then ran Truvari collapse with --intra on the file and everything appeared to work. When I checked the difference between the two VCF files using bcftools isec, however, I noticed that, at least in this case, Truvari seems to discard data or, worse, merge different SV types that start at more or less the same position.
To Reproduce : truvari collapse --intra --keep maxqual --gt het --chain -i sample_merged_10bp.sorted.vcf.gz -o truvari-dev_merge.vcf -c truvari-dev_collapsed.vcf -f /home/fabbial/reference/all.chrs.con.fasta
Expected behavior : A clear and concise description of what you expected to happen.
Example Data : If applicable/possible, add example data to help recreate your problem.
Additional context : Original file:
Chr10 15693 pbsv.INS.9527 G GGTTAATGTGGACCCCCGTTTTTATAACAGGAATCAATCGTGTAAACAGTACCATTTCCCTGGATCAAGTAGTTTGGTACACACGAATTCATAGCTGGAATATCAAGTGCACATACACGAGTGTTTAATACGAATTATTACATCATAGGCCCGCAGGCATAGCCTTAAATCATCGGACCGAGAGGTCCGGAATACATCCAAAAGCAGAAATAGATAAGGCAGCGGAATTCAGGCAGCGGCATGCCATTGCCACAGGCAACGACCGTGCTAAGACCTACTGGACGCCATCGTCTTCATCTCCTTCCTGATAATAGGCATAGGGATCCTCCGAGGCTTCATCTCCCACTTCTGATAAATTTTATATTTGCAAGGATGAGTACCAACCGTACTCAGCAAGCCACCACAGCAACAATGCATATGAAAGGGGGAGTTCAAAGGATGGCCATAGTTCTTTTGCGCAAAGCAAGTTTTGTAATTCTTTTCACAAGCCTAAGACCTAGCATTGACTGATCAAATTTTTAGTACCAGAGTTTGTATTTAAACAACGACGGTTCTGTCCACCATCCATTGTGATCCCAAAGCTTCCCGCCATTGATTCGTCATGGTTTTCTGAGGACGTCCACCTTCCCGCCTCTCAGGAAGTGGCTCCAACAGCATAAAATTCATCATGCAATATCCCATCCCACACAAGTTAAGAATTTAGAGTCTAGCCAAGTGTAATACATGTCCCGGTGCTCAATAACCGCGAGCACGGCTATTCGAATAGGTTTGGTTTACTCACACTGCAGTGGATGTACACTTTACCCGCACTCCGCGACTGCCCAACACATGAGCCTCGTCCCAACACATGAGACGCGTCACGGCAAAGCTTTTCGATAACCTCGCATTGGCAGTACCCGCTCCAGGAACTTTTCATCCTCATGCACTCTAGGAATACACGGTTTCTAGCAGTGAGAGGAGTTCTGGCGCACCCGGGAAGGGAAGACTCACACATGCATTAAGTTATAATTATGTTTTAGATTCTCACATGGCAGTCCTACCGATGGCGACACCACTGTAGACACCCTCCTCGCGGTCCTACCAATGGCTGCCCCACCGTAGAGCCCCTGCCTCACACATCAAGAAACCACTATGCATGGATACTGCCTCCGCTCAGCTATCTACTCCGCTAGGTCTATACCCATACGAGAAGTGCGGTTGTACGGGGGTCGTTTCATGCTTAACCTCATGGCTCGGTCCTTAATTGACCAGGGACGGCACTAGCCTTTTCCGGACACCACCCAAGTCCTCCAGCCGCCCCAGTCGAAAACAGTTGTTTTACTTTATTTTCCTTTCACAAATTATGTCATCAATATCATGGCAATGTGGCGCTCATGTCTCCACATGCCGCATCTCAATTACCTTCCCAAAGGTAATTGCCCAAGCATATAGCATTTGATAAATATGAGTATGCATGAATCTAAAATAGCATTTCTAAGCAAGTGTCATAGTTGACTAGGGACTCGTACGTATCCATGGTTACAAAGATTTAAAGGTGAACAATAATCAAGGCATGGCATAATCACAAGTAGGAGGTTCATAATTGCATGCAATTTTATTTATAAACAAAAGAATTTCGCAATTGGGATCAACATGTTCAAGGAATAGTGATGACTTGCCTTGCTCGAGGTCTTGCGGGTCTTGGCCTTCACCTGGATCCGCGGCTCCCTCGGTCTCTATAGTTACGTGCGAAAATTGATTTGAATTCGGTTAGAATTCAAATAAAAATCCAAGTAAATCCGAATGGAAGTCGAACGCGAAAGTCAATTCCTTTTTATTAATTTTACTATCCGCGAACTATGGCAAACCCCAATTTTGGTTTAAATTATTTTGCGGTTACAATATTTTGTTGTCGTATGTTTTTAATTTAATCTACCCTAGCATTATATCCATATATTAGGTTAGAAATTTCTTATCGCGAGCTAAATGTCGG
GCGGAATCCTAAAATTATCTTATAATATTATACGACTTAATTTAGTCTATGATTAAATATAATACACGGTTAACACCCTAGTAATTAAATCCGAATCGCTACCGTTGATCGATTATTTATAGAGATTACCCAGAAATAATCCACAATAATTTACGAGAATTCATACATTTGTTTAATTATTAATTACATCTAAATTAATACCGCGAATTGATTTCTTATGGAGAGTACTAAAAGTTATCTATAATCTATAGGAATTTATCCCATTATATTTTCATTAATCCTATATTTAAATGCTAGTATTTTAGGTATTTAATTAGTAAGAGTTTATCTAACTATATTTTCATTAATCCTACAATTAAATTCTAATATTTGCAAATATTTTAAATTTCCCCCTAAATTTTTCTTTCTCTTTTTCTTTCCCTTTTTCCTCTCCTCTCTTTTCTTTTCTTTTTTCTTTCTCTCCCTTTCTTTTCTTTTCTCTCCTGGCTTCCTCCTCCCTTCCTCTCCTTCTCTTTCTCGGCTCTCCCTCTCTTTCTCTCGGCTCTCTCTCTTTCCCACCCGAGCGGTGGCGGCGGCGGACTCGAGGGAAACGGAGGGCGACTCATCGGCGGCGGCAGCGGCGATGACGGCGGCAGCGACGGCGGCGCACGGTGGCGCGAGAACGGCGGTGCGGGCACGGAAACGGTCGGCGCGGCAGCACGACGGCGGCGGTGCGGCGGCACTACGATGGCGGCGCGACGGCGTGGAGGAGTGGGTGGTGGAGACGGGGAGAAGGGGGGGGGAATAGTGAGGCTTTTATAGGGGAAGGGAGAGAGATAAGGGGGAAGGAGGAGGAGGGAAGAGGGAAGAGGGGAAGAGGGGAAGAAGAAGAGGGGAAAGAGAGGAGGGGGAGGGGCGGCGACGCGGCTGCGGTGACGGGATGGCGGCGCGGGGCTCGGCGCGCGGCGGGACGCGAGACGCGACGGCGACGGATGAGCGGCGACGGGACGGCGACGCGACGGGCGACGGCGCGCGGCGATGGCGACGAGCGGGCGGCGCGGCGCGGGGCTCGGAGCGGCTCGGCGCGGGAGGGGACGGCGACGCGACGGGCGGTGACGGGGTGCGTGGGCGCGCGGGGCGAGGTGGCAGGCGGCGATGGGACGGCGACGCGACACGGGACGGCGACGCGACGGGCGGTGACGGGGTGCGTGGGCGCGCGGGGCGAGGTGGCAGGCGGCGATGGGACGGCGACGCGACACGGGACGGCGACGCGACGGCGACGGTGATGCGACGGCGAGCGACGCGGCAGAGGGGAGGCGGAGCGGCGCGCGTGGCAGGGGAGGGGAAAGGCTGGGGACCAGGTCGAACACGTGGCGGGCAATGAACAGTGCACTTTTCCAATAAACCGATTTTAGAGTGTTTCTCATATGAATTTGATTCCGAAATTCTTAATTTTTTGCATAAATGAAGTTTTACCCCATATTTATATTATTCTAACTAAAGATTCACCTAATTTAATATCACTCATATTTTGTTTATATAATTCATTTGAATTTTTAATTAAAGTTAATTCTCATTCCATCGTATTAAAATTTAATTGTTGTTAATATGGTTGCGATAACATTTTATTTATTTCCAAACCCACCTAATCTTTATTTTAATTTATATTTTAATTATTTATTTAGCCCACTTGATTTTTAGGGTTTATTCCTAGTTAATTTCCTCCCATTTGTGATCGATGAAATCCGAAATCAAAATCCAATAAAATCTTCGAATAAAATTGGCATGATGCAATTTATTTAAAAAGTTTTTTTTTTTTTGAAGATCAGAATTTTTTTGGAGTCTTTGATTTTGTTGGTCGAATTTTCAGAATGTTACA 133 PASS SUPP=3;SUPP_VEC=1011;SVLEN=3827;SVTYPE=INS;SVMETHOD=SURVIVOR1.0.7;CHR2=Chr10;END=15693;CIPOS=0,1;CIEND=-6,1;STRANDS=+- 
GT:PSV:LN:DR:ST:QV:TY:ID:RAL:AAL:CO 1/1:NA:3823:1,15:+-:133:INS:cuteSV.INS.838:G:GGTTAATGTGGACCCCCGTTTTTATAACAGGAATCAATCGTGTAAACAGTACCATTTCCCTGGATCAAGTAGTTTGGTACACACGAATTCATAGCTGGAATATCAAGTGCACATACACGAGTGTTTAATACGAATTATTACATCATAGGCCGCAGGCATAGCCTTAAATCATCGGACCGAGAGGTCCGGAATACATCCAAAAGCAGAAATAGATAAGGCAGCGGAATTCAGGCAGCGGCATGCCATTGCCACAGGCAACGACCGTGCTAAGACCTACTGGACGCCATCGTCTTCATCTCCTTCCTGATAATAGGCATAGGGATCCTCCGAGGCTTCATCTCCCACTTCTGATAAATTTTATATTTGCAAGGATGAGTACCAACCGTACTCAGCAAGCCACCACAGCAACAATGCATATGAAAGGGGGAGTTCAAAGGATGGCCATAGTTCTTTTGCGCAAAGCAAGTTTTGTAATTCTTTTCACAAGCCTAAGACCTAGCATTGACTGATCAAATTTTTAGTACCAGAGTTTGTATTTAAACAACGACGGTTCTGTCCACCATCCATTGTGATCCCAAAGCTTCCCGCCATTGATTCGTCATGGTTTTCTGAGGACGTCCACCTTCCCGCCTCTCAGGAAGTGGCTCCAACAGCATAAAATTCATCATGCAATATCCCATCCCACACAAGTTAAGAATTTAGAGTCTAGCCAAGTGTAATACATGTCCCGGTGCTCAATAACCGCGAGCACGGCTATTCGAATAGGTTTGGTTTACTCACACTGCAGTGGATGTACACTTTACCCGCACTCCGCGACTGCCCAACACATGAGCCTCGTCCCAACACATGAGACGCGTCACGGCAAAGCTTTTCGATAACCTCGCATTGGCAGTACCCGCTCCAGGAACTTTTCATCCTCATGCACTCTAGGAATACACGGTTTCTAGCAGTGAGAGGAGTTCTGGCGCACCCGGGAAGGGAAGACTCACACATGCATTAAGTTATAATTATGTTTTAGATTCTCACATGGCAGTCCTACCGATGGCGACACCACTGTAGACACCCTCCTCGCGGTCCTACCAATGGCTGCCCCACCGTAGAGCCCCTGCCTCACACATCAAGAAACCACTATGCATGGATACTGCCTCCGCTCAGCTATCTACTCCGCTAGGTCTATACCCATACGAGAAGTGCGGTTGTACGGGGGTCGTTTCATGCTTAACCTCATGGCTCGGTCCTTAATTGACCAGGGACGGCACTAGCCTTTTCCGGACACCACCCAAGTCCTCCAGCCGCCCCAGTCGAAAACAGTTGTTTTACTTTATTTTCCTTTCACAAATTATGTCATCAATATCATGGCAATGTGGCGCTCATGTCTCCACATGCCGCATCTCAATTACCTTCCCAAAGGTAATTGCCCAAGCATATAGCATTTGATAAATATGAGTATGCATGAATCTAAAATAGCATTTCTAAGCAAGTGTCATAGTTGACTAGGGACTCGTACGTATCCATGGTTACAAAGATTTAAAGGTGAACAATAATCAAGGCATGGCATAATCACAAGTAGGAGGTTCATAATTGCATGCAATTTTATTTATAAACAAAAGAATTTCGCAATTGGGATCAACATGTTCAAGGAATAGTGATGACTTGCCTTGCTCGAGGTCTTGCGGGTCTTGGCCTTCACCTGGATCCGCGGCTCCCTCGGTCTCTATAGTTACGTGCGAAAATTGATTTGAATTCGGTTAGAATTCAAATAAAAATCCAAGTAAATCCGAATGGAAGTCGAACGCGAAAGTCAATTCCTTTTTATTAATTTTACTATCCGCGAACTATGGCAAACCCCAATTTTGGTTTAAATTATTTTGCGGTTACAATATTTTGTTGTCGTATGTTTTTAATTTAATCTACCCTAG
CATTATATCCATATATTAGGTTAGAAATTTCTTATCGCGAGCTAAATGTCGGGCGGAATCCTAAAATTATCTTATAATATTATACGACTTAATTTAGTCTATGATTAAATATAATACACGGTTAACACCCTAGTAATTAAATCCGAATCGCTACCGTTGATCGATTATTTATAGAGATTACCCAGAAATAATCCACAATAATTTACGAGAATTCATACATTTGTTTAATTATTAATTACATCTAAATTAATACCGCGAATTGATTTCTTATGGAGAGTACTAAAAGTTATCTATAATCTATAGGAATTTATCCCATTATATTTTCATTAATCCTATATTTAAATGCTAGTATTTTAGGTATTTAATTAGTAAGAGTTTATCTAACTATATTTTCATTAATCCTACAATTAAATTCTAATATTTGCAAATATTTTAAATTTCCCCCTAAATTTTTCTTTCTCTTTTTCTTTCCCTTTTTCCTCTCCTCTCTTTTCTTTCTTTTTTCTTTCTCTCCCTTTCTTTTCTTTTCTCTCCTGGCTTCCTCCTCCCTTCCTCTCCTTCTCTTTCTCGGCTCTCCCTCTCTTTCTCTCGGCTCTCTCTCTTTCCCACCCGAGCGGTGGCGGCGGCGGACTCGAGGGAAACGGAGGGCGACTCATCGGCGGCGGCAGCGGCGATGACGGCGGCAGCGACGGCGGCGCACGGTGGCGCGAGAACGGCGGTGCGGGCACGGAAACGGTCGGCGCGGCAGCACGACGGCGGCGGTGCGGCGGCACTACGATGGCGGCGCGACGGCGTGGAGGAGTGGGTGGTGGAGACGGGGAGAAGGGGGGGGGAATAGTGAGGCTTTTATAGGGGAAGGGAGAGAGATAAGGGGGAAGGAGGAGGAGGGAAGAGGGAAGAGGGGAAGAGGGGAAGAAGAAGAGGGGAAAGAGAGAGGGGGAGGGGCGGCGACGCGGCTGCGGTGACGGGATGGCGGCGCGGGGCTCGGCGCGCGGCGGGACGCGAGACGCGACGGCGACGGATGAGCGGCGACGGGACGGCGACGCGACGGGCGACGGCGCGCGGCGATGGCGACGAGCGGGCGGCGCGGCGCGGGGCTCGGAGCGGCTCGGCGCGGGAGGGGACGGCGACGCGACGGGCGGTGACGGGGTGCGTGGGCGCGCGGGGCGAGGTGGCAGGCGGCGATGGGACGGCGACGCGACACGGGACGGCGACGCGACGGGCGGTGACGGGGTGCGTGGGCGCGCGGGGCGAGGTGGCAGGCGGCGATGGGACGGCGACGCGACACGGGACGGCGACGCGACGGCGACGGTGATGCGACGGCGAGCGACGCGGCAGAGGGGAGGCGGAGCGGCGCGCGTGGCAGGGGAGGGGAAAGGCTGGGGACCAGGTCGAACACGTGGCGGGCAATGAACAGTGCACTTTTCCAATAAACCGATTTTAGAGTGTTTCTCATATGAATTTGATTCCGAAATTCTTAATTTTTTGCATAAATGAAGTTTTACCCCATATTTATATTATTCTAACTAAAGATTCACCTAATTTAATATCACTCATATTTTGTTTATATAATTCATTTGAATTTTTAATTAAAGTTAATTCTCATTCCATCGTATTAAAATTTATTGTTGTTAATATGGTTGCGATAACATTTTATTTATTTCCAAACCCACCTAATCTTTATTTAATTTATATTTTAATTATTTATTTAGCCCACTTGATTTTTAGGGTTTATTCCTAGTTAATTTCCTCCCATTTGTGATCGATGAAATCCGAAATCAAAATCCAATAAAATCTTCGAATAAAATTGGCATGATGCAATTTATTTAAAAAGTTTTTTTTTTTTGAAGATCAGAATTTTTTTGGAGTCTTTGATTTTGTTGGTCGAATTTTCAGAATGTTACA:Chr10_15693-Chr10_15693 ./.:NaN:0:0,0:--:NaN:NaN:NaN:NAN:NAN:NAN 
1/1:NA:3829:0,0:+-:.:INS:pbsv.INS.9527:G:GGTTAATGTGGACCCCCGTTTTTATAACAGGAATCAATCGTGTAAACAGTACCATTTCCCTGGATCAAGTAGTTTGGTACACACGAATTCATAGCTGGAATATCAAGTGCACATACACGAGTGTTTAATACGAATTATTACATCATAGGCCCGCAGGCATAGCCTTAAATCATCGGACCGAGAGGTCCGGAATACATCCAAAAGCAGAAATAGATAAGGCAGCGGAATTCAGGCAGCGGCATGCCATTGCCACAGGCAACGACCGTGCTAAGACCTACTGGACGCCATCGTCTTCATCTCCTTCCTGATAATAGGCATAGGGATCCTCCGAGGCTTCATCTCCCACTTCTGATAAATTTTATATTTGCAAGGATGAGTACCAACCGTACTCAGCAAGCCACCACAGCAACAATGCATATGAAAGGGGGAGTTCAAAGGATGGCCATAGTTCTTTTGCGCAAAGCAAGTTTTGTAATTCTTTTCACAAGCCTAAGACCTAGCATTGACTGATCAAATTTTTAGTACCAGAGTTTGTATTTAAACAACGACGGTTCTGTCCACCATCCATTGTGATCCCAAAGCTTCCCGCCATTGATTCGTCATGGTTTTCTGAGGACGTCCACCTTCCCGCCTCTCAGGAAGTGGCTCCAACAGCATAAAATTCATCATGCAATATCCCATCCCACACAAGTTAAGAATTTAGAGTCTAGCCAAGTGTAATACATGTCCCGGTGCTCAATAACCGCGAGCACGGCTATTCGAATAGGTTTGGTTTACTCACACTGCAGTGGATGTACACTTTACCCGCACTCCGCGACTGCCCAACACATGAGCCTCGTCCCAACACATGAGACGCGTCACGGCAAAGCTTTTCGATAACCTCGCATTGGCAGTACCCGCTCCAGGAACTTTTCATCCTCATGCACTCTAGGAATACACGGTTTCTAGCAGTGAGAGGAGTTCTGGCGCACCCGGGAAGGGAAGACTCACACATGCATTAAGTTATAATTATGTTTTAGATTCTCACATGGCAGTCCTACCGATGGCGACACCACTGTAGACACCCTCCTCGCGGTCCTACCAATGGCTGCCCCACCGTAGAGCCCCTGCCTCACACATCAAGAAACCACTATGCATGGATACTGCCTCCGCTCAGCTATCTACTCCGCTAGGTCTATACCCATACGAGAAGTGCGGTTGTACGGGGGTCGTTTCATGCTTAACCTCATGGCTCGGTCCTTAATTGACCAGGGACGGCACTAGCCTTTTCCGGACACCACCCAAGTCCTCCAGCCGCCCCAGTCGAAAACAGTTGTTTTACTTTATTTTCCTTTCACAAATTATGTCATCAATATCATGGCAATGTGGCGCTCATGTCTCCACATGCCGCATCTCAATTACCTTCCCAAAGGTAATTGCCCAAGCATATAGCATTTGATAAATATGAGTATGCATGAATCTAAAATAGCATTTCTAAGCAAGTGTCATAGTTGACTAGGGACTCGTACGTATCCATGGTTACAAAGATTTAAAGGTGAACAATAATCAAGGCATGGCATAATCACAAGTAGGAGGTTCATAATTGCATGCAATTTTATTTATAAACAAAAGAATTTCGCAATTGGGATCAACATGTTCAAGGAATAGTGATGACTTGCCTTGCTCGAGGTCTTGCGGGTCTTGGCCTTCACCTGGATCCGCGGCTCCCTCGGTCTCTATAGTTACGTGCGAAAATTGATTTGAATTCGGTTAGAATTCAAATAAAAATCCAAGTAAATCCGAATGGAAGTCGAACGCGAAAGTCAATTCCTTTTTATTAATTTTACTATCCGCGAACTATGGCAAACCCCAATTTTGGTTTAAATTATTTTGCGGTTACAATATTTTGTTGTCGTATGTTTTTAATTTAATCTACCCTAGCATTATATCCATATATTAGGTTAGAAATTTCTTATCGCG
AGCTAAATGTCGGGCGGAATCCTAAAATTATCTTATAATATTATACGACTTAATTTAGTCTATGATTAAATATAATACACGGTTAACACCCTAGTAATTAAATCCGAATCGCTACCGTTGATCGATTATTTATAGAGATTACCCAGAAATAATCCACAATAATTTACGAGAATTCATACATTTGTTTAATTATTAATTACATCTAAATTAATACCGCGAATTGATTTCTTATGGAGAGTACTAAAAGTTATCTATAATCTATAGGAATTTATCCCATTATATTTTCATTAATCCTATATTTAAATGCTAGTATTTTAGGTATTTAATTAGTAAGAGTTTATCTAACTATATTTTCATTAATCCTACAATTAAATTCTAATATTTGCAAATATTTTAAATTTCCCCCTAAATTTTTCTTTCTCTTTTTCTTTCCCTTTTTCCTCTCCTCTCTTTTCTTTTCTTTTTTCTTTCTCTCCCTTTCTTTTCTTTTCTCTCCTGGCTTCCTCCTCCCTTCCTCTCCTTCTCTTTCTCGGCTCTCCCTCTCTTTCTCTCGGCTCTCTCTCTTTCCCACCCGAGCGGTGGCGGCGGCGGACTCGAGGGAAACGGAGGGCGACTCATCGGCGGCGGCAGCGGCGATGACGGCGGCAGCGACGGCGGCGCACGGTGGCGCGAGAACGGCGGTGCGGGCACGGAAACGGTCGGCGCGGCAGCACGACGGCGGCGGTGCGGCGGCACTACGATGGCGGCGCGACGGCGTGGAGGAGTGGGTGGTGGAGACGGGGAGAAGGGGGGGGGAATAGTGAGGCTTTTATAGGGGAAGGGAGAGAGATAAGGGGGAAGGAGGAGGAGGGAAGAGGGAAGAGGGGAAGAGGGGAAGAAGAAGAGGGGAAAGAGAGGAGGGGGAGGGGCGGCGACGCGGCTGCGGTGACGGGATGGCGGCGCGGGGCTCGGCGCGCGGCGGGACGCGAGACGCGACGGCGACGGATGAGCGGCGACGGGACGGCGACGCGACGGGCGACGGCGCGCGGCGATGGCGACGAGCGGGCGGCGCGGCGCGGGGCTCGGAGCGGCTCGGCGCGGGAGGGGACGGCGACGCGACGGGCGGTGACGGGGTGCGTGGGCGCGCGGGGCGAGGTGGCAGGCGGCGATGGGACGGCGACGCGACACGGGACGGCGACGCGACGGGCGGTGACGGGGTGCGTGGGCGCGCGGGGCGAGGTGGCAGGCGGCGATGGGACGGCGACGCGACACGGGACGGCGACGCGACGGCGACGGTGATGCGACGGCGAGCGACGCGGCAGAGGGGAGGCGGAGCGGCGCGCGTGGCAGGGGAGGGGAAAGGCTGGGGACCAGGTCGAACACGTGGCGGGCAATGAACAGTGCACTTTTCCAATAAACCGATTTTAGAGTGTTTCTCATATGAATTTGATTCCGAAATTCTTAATTTTTTGCATAAATGAAGTTTTACCCCATATTTATATTATTCTAACTAAAGATTCACCTAATTTAATATCACTCATATTTTGTTTATATAATTCATTTGAATTTTTAATTAAAGTTAATTCTCATTCCATCGTATTAAAATTTAATTGTTGTTAATATGGTTGCGATAACATTTTATTTATTTCCAAACCCACCTAATCTTTATTTTAATTTATATTTTAATTATTTATTTAGCCCACTTGATTTTTAGGGTTTATTCCTAGTTAATTTCCTCCCATTTGTGATCGATGAAATCCGAAATCAAAATCCAATAAAATCTTCGAATAAAATTGGCATGATGCAATTTATTTAAAAAGTTTTTTTTTTTTTGAAGATCAGAATTTTTTTGGAGTCTTTGATTTTGTTGGTCGAATTTTCAGAATGTTACA:Chr10_15693-Chr10_15693 
1/1:NA:3829:0,0:+-:57:INS:Sniffles2.INS.0S9:N:GTTAATGTGGACCCCCGTTTTTATAACAGGAATCAATCGTGTAAACAGTACCATTTCCCTGGATCAAGTAGTTTGGTACACACGAATTCATAGCTGGAATATCAAGTGCACATACACGAGTGTTTAATACGAATTATTACATCATAGGCCCGCAGGCATAGCCTTAAATCATCGGACCGAGAGGTCCGGAATACATCCAAAAGCAGAAATAGATAAGGCAGCGGAATTCAGGCAGCGGCATGCCATTGCCACAGGCAACGACCGTGCTAAGACCTACTGGACGCCATCGTCTTCATCTCCTTCCTGATAATAGGCATAGGGATCCTCCGAGGCTTCATCTCCCACTTCTGATAAATTTTATATTTGCAAGGATGAGTACCAACCGTACTCAGCAAGCCACCACAGCAACAATGCATATGAAAGGGGGAGTTCAAAGGATGGCCATAGTTCTTTTGCGCAAAGCAAGTTTTGTAATTCTTTTCACAAGCCTAAGACCTAGCATTGACTGATCAAATTTTTAGTACCAGAGTTTGTATTTAAACAACGACGGTTCTGTCCACCATCCATTGTGATCCCAAAGCTTCCCGCCATTGATTCGTCATGGTTTTCTGAGGACGTCCACCTTCCCGCCTCTCAGGAAGTGGCTCCAACAGCATAAAATTCATCATGCAATATCCCATCCCACACAAGTTAAGAATTTAGAGTCTAGCCAAGTGTAATACATGTCCCGGTGCTCAATAACCGCGAGCACGGCTATTCGAATAGGTTTGGTTTACTCACACTGCAGTGGATGTACACTTTACCCGCACTCCGCGACTGCCCAACACATGAGCCTCGTCCCAACACATGAGACGCGTCACGGCAAAGCTTTTCGATAACCTCGCATTGGCAGTACCCGCTCCAGGGAACTTTTCATCCTCATGCACTCTAGGAATACACGGTTTCTAGCAGTGAGAGGAGTTCTGGCGCACCCGGGAAGGGAAGACTCACACATGCATTAAGTTATAATTATGTTTTAGATTCTCACATGGCAGTCCTACCGATGGCGACACCACTGTAGACACCCTCCTCGCGGTCCTACCAATGGCTGCCCCACCGTAGAGCCCCTGCCTCACACATCAAGAAACCACTATGCATGGATACTGCCTCCGCTCAGCTATCTACTCCGCTAGGTCTATACCCATACGAGAAGTGCGGTTGTACGGGGGTCGTTTCATGCTTAACCTCATGGCTCGGTCCTTAATTGACCAGGGACGGCACTAGCCTTTTCCGGACACCACCCAAGTCCTCCAGCCGCCCCAGTCGAAAACAGTTGTTTTACTTTATTTTCCTTTCACAAATTATGTCATCAATATCATGGCAATGTGGCGCTCATGTCTCCACATGCCGCATCTCAATTACCTTCCCAAAGGTAATTGCCCAAGCATATAGCATTTGATAAATATGAGTATGCATGAATCTAAAATAGCATTTCTAAGCAAGTGTCATAGTTGACTAGGGACTCGTACGTATCCATGGTTACAAAGATTTAAAGGTGAACAATAATCAAGGCATGGCATAATCACAAGTAGGAGGTTCATAATTGCATGCAATTTTATTTATAAACAAAAGAATTTCGCAATTGGGATCAACATGTTCAAGGAATAGTGATGACTTGCCTTGCTCGAGGTCTTGCGGGTCTTGGCCTTCACCTGGATCCGCGGCTCCCTCGGTCTCTATAGTTACGTGCGAAAATTGATTTGAATTCGGTTAGAATTCAAATAAAAATCCAAGTAAATCCGAATGGAAGTCGAACGCGAAAGTCAATTCCTTTTTATTAATTTTACTATCCGCGAACTATGGCAAACCCCAATTTTGGTTTAAATTATTTTGCGGTTACAATATTTTGTTGTCGTATGTTTTTAATTTAATCTACCCTAGCATTATATCCATATATTAGGTTAGAAATTTCTTA
TCGCGAGCTAAATGTCGGGCGGAATCCTAAAATTATCTTATAATATTATACGACTTAATTTAGTCTATGATTAAATATAATACACGGTTAACACCCTAGTAATTAAATCCGAATCGCTACCGTTGATCGATTATTTATAGAGATTACCCAGAAATAATCCACAATAATTTACGAGAATTCATACATTTGTTTAATTATTAATTACATCTAAATTAATACCGCGAATTGATTTCTTATGGAGAGTACTAAAAGTTATCTATAATCTATAGGAATTTATCCCATTATATTTTCATTAATCCTATATTTAAATGCTAGTATTTTAGGTATTTAATTAGTAAGAGTTTATCTAACTATATTTTCATTAATCCTACAATTAAATTCTAATATTTGCAAATATTTTAAATTTCCCCCTAAATTTTTCTTTCTCTTTTTCTTTCCCTTTTTCCTCTCCTCTCTTTTCTTTCTTTTTTCTTTCTCTCCCTTTCTTTTCTTTTCTCTCCTGGCTTCCTCCTCCCTTCCTCTCCTTCTCTTTCTCGGCTCTCCCTCTCTTTCTCTCGGCTCTCTCTCTTTCCCACCCGAGCGGTGGCGGCGGCGGACTCGAGGGAAACGGAGGGCGACTCATCGGCGGCGGCAGCGGCGATGACGGCGGCAGCGACGGCGGCGCACGGTGGCGCGAGAACGGCGGTGCGGGCACGGAAACGGTCGGCGCGGCAGCACGACGGCGGCGGTGCGGCGGCACTACGATGGCGGCGCGACGGCGTGGAGGAGTGGGTGGTGGAGACGGGGAGAAGGGGGGGGAATAGTGAGGCTTTTATAGGGGAAGGGAGAGAGATAAGGGGGAAGGAGGAGGAGGGAAGAGGGAAGAGGGGAAGAGGGGAAGAAGAAGAGGGGAAAGAGAGGAGGGGGAGGGGCGGCGACGCGGCTGCGGTGACGGGATGGCGGCGCGGGGCTCGGCGCGCGGCGGGACGCGAGACGCGACGGCGACGGATGAGCGGCGACGGGACGGCGACGCGACGGGCGACGGCGCGCGGCGATGGCGACGAGCGGGCGGCGCGGCGCGGGGCTCGGAGCGGCTCGGCGCGGGAGGGGACGGCGACGCGACGGGCGGTGACGGGGTGCGTGGGCGCGCGGGGCGAGGTGGCAGGCGGCGATGGGACGGCGACGCGACACGGGACGGCGACGCGACGGGCGGTGACGGGGTGCGTGGGCGCGCGGGGCGAGGTGGCAGGCGGCGATGGGACGGCGACGCGACACGGGACGGCGACGCGACGGCGACGGTGATGCGACGGCGAGCGACGCGGCAGAGGGGAGGCGGAGCGGCGCGCGTGGCAGGGGAGGGGAAAGGCTGGGGACCAGGTCGAACACGTGGCGGGCAATGAACAGTGCACTTTTCCAATAAACCGATTTTAGAGTGTTTCTCATATGAATTTGATTCCGAAATTCTTAATTTTTTGCATAAATGAAGTTTTACCCCATATTTATATTATTCTAACTAAAGATTCACCTAATTTAATATCACTCATATTTTGTTTATATAATTCATTTGAATTTTTAATTAAAGTTAATTCTCATTCCATCGTATTAAAATTTAATTGTTGTTAATATGGTTGCGATAACATTTTATTTATTTCCAAACCCACCTAATCTTTATTTTAATTTATATTTTAATTATTTATTTAGCCCACTTGATTTTTAGGGTTTATTCCTAGTTAATTTCCTCCCATTTGTGATCGATGAAATCCGAAATCAAAATCCAATAAAATCTTCGAATAAAATTGGCATGATGCAATTTATTTAAAAAGTTTTTTTTTTTTTTGAAGATCAGAATTTTTTTGGAGTCTTTGATTTTGTTGGTCGAATTTTCAGAATGTTACA:Chr10_15694-Chr10_15694
Chr10 15694 svim.BND.21671 N N[Chr5:29772388[ 13 PASS SUPP=2;SUPP_VEC=1100;SVLEN=0;SVTYPE=TRA;SVMETHOD=SURVIVOR1.0.7;CHR2=Chr5;END=29772387;CIPOS=0,0;CIEND=0,1;STRANDS=++ GT:PSV:LN:DR:ST:QV:TY:ID:RAL:AAL:CO 0/1:NA:29756694:22,10:++:13:TRA:cuteSV.BND.1288:NA:NA:Chr10_15694-Chr5_29772388 ./.:NA:29756693:0,0:++:11:TRA:svim.BND.21671:NA:NA:Chr10_15694-Chr5_29772387 ./.:NaN:0:0,0:--:NaN:NaN:NaN:NAN:NAN:NAN ./.:NaN:0:0,0:--:NaN:NaN:NaN:NAN:NAN:NAN
Chr10 15699 svim.BND.21672 N N[Chr5:29768562[ 56 PASS SUPP=3;SUPP_VEC=1101;SVLEN=0;SVTYPE=TRA;SVMETHOD=SURVIVOR1.0.7;CHR2=Chr5;END=29768562;CIPOS=0,1;CIEND=-1,0;STRANDS=++ GT:PSV:LN:DR:ST:QV:TY:ID:RAL:AAL:CO 0/1:NA:29752862:22,10:++:13:TRA:cuteSV.BND.1289:NA:NA:Chr10_15700-Chr5_29768562 ./.:NA:29752863:0,0:++:8:TRA:svim.BND.21672:NA:NA:Chr10_15699-Chr5_29768562 ./.:NaN:0:0,0:--:NaN:NaN:NaN:NAN:NAN:NAN 0/1:NA:29752861:0,0:++:56:TRA:Sniffles2.BND.635S9:NA:NA:Chr10_15700-Chr5_29768561
Truvari merged file: only the insertion remains; the two translocations are discarded.
Thanks for your work
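For anyone wanting to repeat the before/after check without bcftools isec, here is a minimal Python sketch. It assumes plain tab-delimited VCF text (for a .vcf.gz, read it with gzip.open(path, "rt")) and identifies a record by CHROM, POS and ID. The function names parse_records and missing_by_svtype are hypothetical helpers, not part of Truvari.

```python
from collections import Counter


def parse_records(vcf_text):
    """Map (CHROM, POS, ID) -> SVTYPE for every data line in VCF text."""
    records = {}
    for line in vcf_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        chrom, pos, vid, info = fields[0], fields[1], fields[2], fields[7]
        svtype = "NA"  # placeholder when INFO carries no SVTYPE
        for entry in info.split(";"):
            if entry.startswith("SVTYPE="):
                svtype = entry.split("=", 1)[1]
        records[(chrom, pos, vid)] = svtype
    return records


def missing_by_svtype(before_text, *after_texts):
    """Count, per SVTYPE, input records found in none of the output texts."""
    before = parse_records(before_text)
    seen = set()
    for text in after_texts:
        seen.update(parse_records(text))
    return Counter(svtype for key, svtype in before.items() if key not in seen)
```

Passing both the -o output and the -c collapsed file as after_texts should separate records that were collapsed on purpose from records that appear in neither file, which are the ones worth reporting.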
Hello,
I am unable to recreate this issue. When I put the following input through truvari collapse -i file.vcf.gz -o out.vcf using v4.2.2-dev
##fileformat=VCFv4.2
##contig=<ID=Chr10,length=248956422,md5=6aef897c3d6ff0c78aff06ac189178dd>
##FILTER=<ID=PASS,Description="All filters passed">
##INFO=<ID=SUPP,Number=1,Type=String,Description="Variant ID">
##INFO=<ID=SUPP_VEC,Number=1,Type=String,Description="Variant ID">
##INFO=<ID=SVTYPE,Number=1,Type=String,Description="Variant type">
##INFO=<ID=SVLEN,Number=.,Type=Integer,Description="Variant length">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT NA24385 2:NA24385 3:NA24385 four
Chr10 15693 pbsv.INS.9527 G GGTTAATGTGGACCCCCGTTTTT 133 PASS SUPP=3;SUPP_VEC=1011;SVLEN=3827;SVTYPE=INS GT 1/1 ./. 1/1 1/1
Chr10 15694 svim.BND.21671 N N[Chr5:29772388[ 13 PASS SUPP=2;SUPP_VEC=1100;SVLEN=0;SVTYPE=TRA GT 0/1 ./. ./. ./.
Chr10 15699 svim.BND.21672 N N[Chr5:29768562[ 56 PASS SUPP=3;SUPP_VEC=1101;SVLEN=0;SVTYPE=TRA GT 0/1 ./. ./. 0/1
I get the output
##fileformat=VCFv4.2
##FILTER=<ID=PASS,Description="All filters passed">
##contig=<ID=Chr10,length=248956422,md5=6aef897c3d6ff0c78aff06ac189178dd>
##INFO=<ID=SUPP,Number=1,Type=String,Description="Variant ID">
##INFO=<ID=SUPP_VEC,Number=1,Type=String,Description="Variant ID">
##INFO=<ID=SVTYPE,Number=1,Type=String,Description="Variant type">
##INFO=<ID=SVLEN,Number=.,Type=Integer,Description="Variant length">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##INFO=<ID=NumCollapsed,Number=1,Type=Integer,Description="Number of calls collapsed into this call by truvari">
##INFO=<ID=CollapseId,Number=1,Type=String,Description="Truvari uid to help tie output.vcf and output.collapsed.vcf entries together">
##INFO=<ID=NumConsolidated,Number=1,Type=Integer,Description="Number of samples consolidated into this call by truvari">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT NA24385 2:NA24385 3:NA24385 four
Chr10 15693 pbsv.INS.9527 G GGTTAATGTGGACCCCCGTTTTT 133 PASS SUPP=3;SUPP_VEC=1011;SVLEN=3827;SVTYPE=INS GT 1/1 ./. 1/1 1/1
Chr10 15694 svim.BND.21671 N N[Chr5:29772388[ 13 PASS SUPP=2;SUPP_VEC=1100;SVLEN=0;SVTYPE=TRA GT 0/1 ./. ./. ./.
Chr10 15699 svim.BND.21672 N N[Chr5:29768562[ 56 PASS SUPP=3;SUPP_VEC=1101;SVLEN=0;SVTYPE=TRA GT 0/1 ./. ./. 0/1
Note that I had to remove most of the INFO/FORMAT fields to make this test file. So it is possible the INFO/FORMAT fields are somehow stopping the BNDs from being processed? Perhaps the same fields that were preventing bcftools from running on the input?
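A rough sketch of that kind of reduction in Python, in case it helps with testing: it keeps only a chosen set of INFO keys and reduces every sample column to its GT value. The function name simplify_record and the KEEP_INFO tuple are hypothetical, and the sketch assumes tab-delimited data lines whose FORMAT column contains GT. Unlike the test file above, it does not shorten the ALT sequence, and header lines would still need to be handled separately.

```python
# INFO keys to retain; chosen to match the reduced test file (an assumption).
KEEP_INFO = ("SUPP", "SUPP_VEC", "SVLEN", "SVTYPE")


def simplify_record(line, keep_info=KEEP_INFO):
    """Return a VCF data line with a reduced INFO column and GT-only samples."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 10:
        raise ValueError("expected a VCF data line with at least one sample column")

    # Keep only the requested INFO entries, preserving their original order.
    info_entries = [
        entry for entry in fields[7].split(";")
        if entry.split("=", 1)[0] in keep_info
    ]
    fields[7] = ";".join(info_entries) if info_entries else "."

    # Find GT in FORMAT, then keep just that subfield for every sample.
    gt_index = fields[8].split(":").index("GT")
    fields[8] = "GT"
    fields[9:] = [sample.split(":")[gt_index] for sample in fields[9:]]
    return "\t".join(fields)
```

Running the full and the reduced file through the same truvari collapse command would then show whether the extra fields are what changes the result.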
I'm trying to build a parser to simplify the SURVIVOR output and make it more similar to bcftools output (and to your example). The problem with using bcftools to merge multiple callers (pbsv, svim, sniffles and cutesv) is that some fields are represented in different ways, and SURVIVOR is more elastic about that. I will reply again once I have tried Truvari with a simplified version, including the one you added here. Thank you Adam for the help!
Sounds good. Just know that SURVIVOR being elastic with the fields doesn't necessarily mean it is handling them properly. It doesn't have the most stringent VCF handling. It may be worthwhile to pre-process the individual callers' results so that the headers and fields are compatible before feeding them into bcftools.