Additional SV callers at merge step

Open Thatguy027 opened this issue 6 years ago • 2 comments

Hello there,

I just incorporated your tools into my SV calling pipeline and it is working great. I did run into a panic error that was a bit confusing, but just re-ran the pipeline and it worked fine. I saved the error then deleted it once the pipeline finished... sorry about that, if it comes up again I'll send it your way.

My question is regarding the merge step of the population calling pipeline. Can this step handle multiple SV callers? I am currently running Delly2, Manta, TIDDIT, and now smoove/LUMPY. My thought was to call sample variants with each of these callers, merge at the sample level to have one VCF per sample, then merge samples to generate a site list with smoove, then recall with smoove and Delly2 because these are the only ones with recall capabilities as far as I can tell.

I am not sure if this will add anything to the smoove/LUMPY or Delly2 calls, given the differences in SV detection among the software packages, but I'm curious to know your thoughts before I attempt to incorporate it into my pipeline.

Best, Stefan

Thatguy027, Apr 18 '18 18:04

That won't work currently. I agree that would be useful. I would like to add support for another caller at some point, but I haven't decided which one; I am open to suggestions. I don't think it would be feasible to support arbitrary callers, but maybe one that complements lumpy could be added.

brentp, Apr 18 '18 18:04

Great, that's what I figured.

From my simple simulations (take a reference genome, insert random sequences of defined length and position, align reference fastq, call insertions), Manta did the best at recovering larger insertions (>100 bp). This may be a good tool to incorporate for the identification of large insertions... from what I can tell, LUMPY and Delly2 don't call this class. An added bonus of Manta is that it returns assembled contigs in the INFO/CONTIG field for inserted sequences, which can be useful for querying (mobile elements, HGT, etc.).
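For anyone who wants to try something similar, here is a rough sketch of only the insertion step of the simulation described above. It is not the exact script I used; the function name `insert_random_sequences` and its parameters are made up for illustration, and it assumes the reference is already loaded as a plain string for a single contig. Read simulation, alignment, and SV calling would still be done with external tools.

```python
import random


def insert_random_sequences(reference, n_insertions, insert_length, seed=0):
    """Insert random sequences into a reference string (hypothetical helper).

    Returns (mutated_sequence, truth), where truth is a list of
    (reference_position, inserted_sequence) tuples sorted by position.
    Positions are 0-based offsets in the ORIGINAL reference; each insert
    is placed immediately before the base at that offset.
    """
    rng = random.Random(seed)  # fixed seed so the truth set is reproducible
    # Pick distinct insertion points, avoiding the very start of the contig.
    positions = sorted(rng.sample(range(1, len(reference)), n_insertions))

    pieces = []
    truth = []
    prev = 0
    for pos in positions:
        inserted = "".join(rng.choice("ACGT") for _ in range(insert_length))
        pieces.append(reference[prev:pos])  # untouched reference up to the site
        pieces.append(inserted)             # the simulated insertion
        truth.append((pos, inserted))
        prev = pos
    pieces.append(reference[prev:])         # remainder of the reference
    return "".join(pieces), truth


if __name__ == "__main__":
    toy_reference = "ACGT" * 250  # stand-in for a real contig
    mutated, truth = insert_random_sequences(toy_reference, 3, 150, seed=42)
    for pos, seq in truth:
        print(pos, len(seq))
```

Keeping the truth list in reference coordinates makes it straightforward to compare against the POS column of each caller's VCF afterwards.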

Thanks for the quick response and great software!

Thatguy027, Apr 18 '18 19:04