
(eventually) prepare VCFs from sample BAMs

Open jburos opened this issue 8 years ago • 0 comments

This is an enhancement that would be nice to have, but is not technically required at this point.

This is partly to address the question "Does cohorts make VCFs from sample files?", which comes up from time to time.

We currently use epidisco to prepare input files including VCFs from patient tumor & normal BAMs. The easiest path to enable this will likely be tighter integration with epidisco.

Our current workflow is:

  1. Set up epidisco using a docker image, as described in the official Running Epidisco With Ketrew/Coclobas tutorial
  2. Given sample BAMs, generate a bash submit script
  3. Copy & paste the submit script to submit the job to epidisco
  4. Monitor job status using the Ketrew UI; wait for results to be written to the NFS server created by epidisco
  5. Mount the NFS server on a VM & set up a Cohort, linking patients to results on the NFS server

It's not clear whether all the output from epidisco is required for every analysis, but many of those outputs would be useful.

Seems like we would need:

  1. [ ] a way for cohorts to dispatch job requests to epidisco, either via an API (epidisco-web?) or by sending the commands to a co-hosted docker container.
  2. [ ] a way for cohorts to link the resulting outputs to the Patients in the cohort, possibly tolerating delay in their arrival.
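
To make item 2 a bit more concrete, here is a rough sketch of what "tolerating delay" could look like. Everything in it is an assumption for illustration: the function name `link_available_vcfs`, the `<results_root>/<patient_id>/<vcf_name>` layout, and the default file name are hypothetical, and none of this is existing cohorts or epidisco API. It only checks which patients already have a VCF on the mounted results directory and reports the rest as pending, so a Cohort could be built from whatever has arrived so far.

```python
from pathlib import Path
from typing import Dict, Iterable, List, Tuple


def link_available_vcfs(
    patient_ids: Iterable[str],
    results_root: Path,
    vcf_name: str = "somatic.vcf",  # hypothetical file name, not epidisco's actual output name
) -> Tuple[Dict[str, Path], List[str]]:
    """Map each patient to its VCF if the file exists; collect the rest as pending.

    Assumes (hypothetically) that results are laid out as
    <results_root>/<patient_id>/<vcf_name> on the mounted NFS share.
    """
    linked: Dict[str, Path] = {}
    pending: List[str] = []
    for patient_id in patient_ids:
        candidate = Path(results_root) / patient_id / vcf_name
        if candidate.is_file():
            linked[patient_id] = candidate
        else:
            # Output has not arrived yet; the caller can re-check later.
            pending.append(patient_id)
    return linked, pending
```

A caller could re-run this periodically and only attach VCF paths for patients in `linked`, leaving the `pending` ones without variants until their results show up.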

It is an open question whether this capability belongs in cohorts or is better left outside of it. The availability of epidisco-web for submitting these pipelines makes integration more attractive.

jburos, Oct 14 '16 16:10