sarek
Perform a compatibility check between the BED file specified with --targets and the reference genome
Description of feature
If the BED file is incompatible with the reference genome (i.e. it specifies regions outside the reference), the pipeline fails only after the alignment step, at base recalibration. It would be nice to have a process that compares, e.g., the FASTA index (or sequence dictionary) with the BED file and throws an error if the BED file is incompatible.
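For illustration, here is a minimal sketch of what such a check could look like in plain Python. It is not taken from the pipeline. The function names `read_fai` and `check_bed` are hypothetical, and the sketch assumes a standard `.fai` index (contig name in column 1, length in column 2) and a BED file with 0-based, half-open coordinates.

```python
def read_fai(fai_lines):
    """Map contig name -> length using the first two columns of a .fai index."""
    lengths = {}
    for line in fai_lines:
        if not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        lengths[fields[0]] = int(fields[1])
    return lengths


def check_bed(bed_lines, lengths):
    """Return a list of problems found; an empty list means the BED file fits."""
    problems = []
    for n, line in enumerate(bed_lines, start=1):
        # Skip blank lines and the optional BED header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        if len(fields) < 3:
            problems.append(f"line {n}: fewer than 3 columns")
            continue
        chrom = fields[0]
        try:
            start, end = int(fields[1]), int(fields[2])
        except ValueError:
            problems.append(f"line {n}: start/end are not integers")
            continue
        if chrom not in lengths:
            problems.append(f"line {n}: contig '{chrom}' not in reference")
        elif start < 0 or start > end or end > lengths[chrom]:
            # BED ends are exclusive, so end == contig length is still valid.
            problems.append(
                f"line {n}: {chrom}:{start}-{end} is outside 0-{lengths[chrom]}"
            )
    return problems
```

A process could call this on the opened `.fai` and BED files and exit with a non-zero status if the returned list is not empty, so the run stops before alignment starts.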
I agree with you, an early failure would be better in that case.
See also https://github.com/nf-core/sarek/issues/97
I think this would be a nice hackathon issue: it is fairly self-contained and not too big. Do you know of any tool that can easily do this, plus possibly other checks, such as whether the file is sorted? I briefly checked bedtools, but as far as I saw it is made for manipulating BED files, not for validating them.
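In case no existing tool fits, the sortedness check could be sketched in a few lines of Python. This is only an assumption about what "sorted" should mean here: contigs appear in the same order as in the reference, and intervals are ordered by start (then end) within each contig. The name `first_unsorted_line` is hypothetical, and contigs missing from the reference are simply ranked last.

```python
def first_unsorted_line(bed_lines, contig_order):
    """Return the 1-based number of the first out-of-order line, or None."""
    rank = {name: i for i, name in enumerate(contig_order)}
    previous = None
    for n, line in enumerate(bed_lines, start=1):
        # Skip blank lines and the optional BED header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        # Unknown contigs get a rank after all known ones (an assumption).
        key = (rank.get(fields[0], len(rank)), int(fields[1]), int(fields[2]))
        if previous is not None and key < previous:
            return n
        previous = key
    return None
```

The contig order can be taken from the first column of the `.fai` index, which lists contigs in the order they appear in the FASTA file.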