Population calling | joint calling | ensemble calling

Open amizeranschi opened this issue 1 year ago • 7 comments

Description of feature

Hi!

I was trying out the joint_germline option, which produces a multi-sample VCF file from haplotypecaller. It would be helpful to have a similar feature for other callers like deepvariant, freebayes and strelka2.

It would also be nice to have support for ensemble calling (combining calls from multiple callers) for use cases where the number of samples is manageable.

One way to achieve this would be to use bcbio.variation.recall, which was created by Brad Chapman and used in bcbio-nextgen.

Relevant documentation from bcbio-nextgen can be found here:

For joint calling, another solution could be to have the variant callers output gVCF files, and then use GATK's GenotypeGVCFs or something similar on those. There was a discussion about this on the deepvariant GitHub page a few years ago: https://github.com/google/deepvariant/issues/142#issuecomment-459440356.
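To make the gVCF route concrete, here is a minimal sketch that only assembles the two GATK command lines (CombineGVCFs, then GenotypeGVCFs). The function name and all file names are hypothetical, and whether gVCFs from callers other than haplotypecaller are accepted by GenotypeGVCFs is an assumption, not something confirmed here; that question is what the linked deepvariant discussion is about.

```python
from typing import List, Sequence


def joint_genotyping_commands(
    reference: str,
    gvcfs: Sequence[str],
    combined_gvcf: str = "cohort.g.vcf.gz",  # hypothetical intermediate file
    output_vcf: str = "cohort.vcf.gz",       # hypothetical final multi-sample VCF
) -> List[List[str]]:
    """Build (but do not run) the two GATK commands for joint genotyping."""
    if not gvcfs:
        raise ValueError("at least one gVCF is required")

    # Step 1: merge the per-sample gVCFs into a single cohort gVCF.
    combine = ["gatk", "CombineGVCFs", "-R", reference]
    for gvcf in gvcfs:
        combine += ["--variant", gvcf]
    combine += ["-O", combined_gvcf]

    # Step 2: genotype the cohort gVCF into a multi-sample VCF.
    genotype = [
        "gatk", "GenotypeGVCFs",
        "-R", reference,
        "-V", combined_gvcf,
        "-O", output_vcf,
    ]
    return [combine, genotype]


if __name__ == "__main__":
    # Sample file names are placeholders for illustration only.
    for cmd in joint_genotyping_commands(
        "ref.fa", ["sampleA.g.vcf.gz", "sampleB.g.vcf.gz"]
    ):
        print(" ".join(cmd))
```

Each returned list could be passed to `subprocess.run(cmd, check=True)` once the inputs have been checked by hand.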

amizeranschi avatar Nov 25 '22 12:11 amizeranschi

I just noticed a relevant past feature request that also ties into this: https://github.com/nf-core/sarek/issues/453

amizeranschi avatar Nov 25 '22 13:11 amizeranschi

  • [ ] https://github.com/nf-core/sarek/issues/1124
  • [ ] FreeBayes
  • [ ] Deepvariant
  • [ ] Strelka2

FriederikeHanssen avatar Jul 18 '23 16:07 FriederikeHanssen

Hi, thanks for this really useful tool. I used it recently, but the joint calling option failed for freebayes. Have others reported this issue?

SofiaMAhmed avatar Jan 31 '24 11:01 SofiaMAhmed

Hi! There is currently no joint calling implemented for freebayes.

FriederikeHanssen avatar Jan 31 '24 11:01 FriederikeHanssen

Thanks for the speedy response. Is this something that may be implemented in the near future? We are really keen to incorporate this pipeline into our routine work. My current workaround is to take the BAMs output by sarek, mark the read groups, and run freebayes separately. That seemed like the best option, other than merging the output VCFs and recalling against the BAMs.
Cheers Soph
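
For anyone trying the same workaround, a minimal sketch of the freebayes step, assuming the intent is to call all samples together outside of sarek. The helper name and file names are hypothetical. It relies on freebayes accepting several BAMs in one invocation and telling samples apart by the sample (SM) tag of their read groups, which is why the read groups need to be set first.

```python
from typing import List, Sequence


def freebayes_joint_command(reference: str, bams: Sequence[str]) -> List[str]:
    """Build (but do not run) one freebayes call over several BAMs.

    freebayes writes VCF to stdout by default, so the caller is expected
    to redirect the output to a file.
    """
    if len(bams) < 2:
        raise ValueError("joint calling needs at least two BAM files")
    # -f gives the reference FASTA; the BAMs follow as positional arguments.
    return ["freebayes", "-f", reference, *bams]


if __name__ == "__main__":
    # Placeholder file names for illustration only.
    cmd = freebayes_joint_command("ref.fa", ["sampleA.bam", "sampleB.bam"])
    print(" ".join(cmd) + " > joint.vcf")
```

To run it from Python, the list can be passed to `subprocess.run(cmd, stdout=out_handle, check=True)` with `out_handle` opened on the target VCF file.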

SofiaMAhmed avatar Jan 31 '24 11:01 SofiaMAhmed

I don't think we have the resources on the developer side at the moment. If you'd like to take a stab at it, we would be very happy to guide you through it, though :)

FriederikeHanssen avatar Jan 31 '24 12:01 FriederikeHanssen

I was looking into this a while ago, but couldn't figure out a practical way to do it. The bcbio-variation-recall package I mentioned above has long been discontinued. Beyond that, there was GLnexus, which looked promising, but it was also discontinued a couple of years ago, and there are currently some serious issues like this that were reported but never resolved.

amizeranschi avatar Jan 31 '24 13:01 amizeranschi