
DSL2: Genotyping on multiple SNP sets in one run?

Open TCLamnidis opened this issue 1 year ago • 1 comments

It might be nice to be able to genotype on multiple SNP sets in a single run. I'm specifically thinking of pileupcaller here, not sure how it would apply to other genotypers, but:

Currently, the reference sheet takes one pileupcaller_{bed,snp} per reference. That means that if one wanted to genotype on two sets of positions, they would need to run the entire pipeline twice, or duplicate a row in the reference sheet just for that additional genotyping. Now, since the latter option will not fly with the ref-sheet validation, one would have to "fake" an entire new reference, thus duplicating all the processing, just for the extra genotypes.

Solution: Maybe we can turn the pileupcaller_bed/snp columns into list columns, e.g. multiple files separated by ;, that would then get split into separate channel elements with the same meta, and thus only duplicate the genotyping step?
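
To make the list-column idea concrete, here is a rough sketch of the splitting logic in plain Python. It is not Nextflow code and not how the pipeline currently works. The function name `split_snp_sets`, the example meta keys, and the file names are all made up for illustration, and pairing the bed and snp entries by position is an assumption on my part.

```python
def split_snp_sets(meta, bed_field, snp_field, sep=";"):
    """Expand one reference-sheet row into one (meta, bed, snp) tuple per SNP set.

    Hypothetical helper: assumes the two columns hold the same number of
    entries and that entries are paired by position.
    """
    beds = [p.strip() for p in bed_field.split(sep) if p.strip()]
    snps = [p.strip() for p in snp_field.split(sep) if p.strip()]
    if len(beds) != len(snps):
        raise ValueError(
            f"Got {len(beds)} bed file(s) but {len(snps)} snp file(s) for {meta}"
        )
    # Every emitted element carries the same meta, so only the genotyping
    # step would see more than one element per reference.
    return [(meta, bed, snp) for bed, snp in zip(beds, snps)]


# Made-up reference-sheet row with two SNP sets for a single reference.
example_meta = {"id": "hs37d5"}
elements = split_snp_sets(
    example_meta,
    "set_a.bed;set_b.bed",
    "set_a.snp;set_b.snp",
)
for element in elements:
    print(element)
```

A row with a single file in each column would pass through as one element, so existing reference sheets would behave as they do now under this sketch.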

TCLamnidis, Jun 21 '24 08:06