Brent Pedersen
the FORMAT is the one that duphold uses, and yes, that is correct. however, you can't simply change the header; you'll have to adjust the values for every record as...
AD should have multiple values for each sample. for a bi-allelic variant, it should have 2 values: the first indicates the number of reads supporting the reference allele and the...
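as a rough check of the point above, here is a minimal sketch that pulls AD out of one sample column and counts its values. the `format` and `sample` strings are made-up illustration data, not taken from any real VCF.

```shell
# Hypothetical FORMAT column and sample column for one bi-allelic record.
format="GT:AD:DP"
sample="0/1:12,9:21"

# Find the position of AD in FORMAT and print the value at the same
# position in the sample column.
ad=$(awk -v f="$format" -v s="$sample" 'BEGIN {
    n = split(f, keys, ":")
    split(s, vals, ":")
    for (i = 1; i <= n; i++) if (keys[i] == "AD") print vals[i]
}')

# First AD value: number of reads supporting the reference allele.
ref_reads=${ad%%,*}

# Count the comma-separated AD values; a bi-allelic variant should have 2.
n_values=$(echo "$ad" | awk -F, '{ print NF }')

echo "AD=$ad ref_reads=$ref_reads n_values=$n_values"
```

if `n_values` is not 2 for a bi-allelic record, the AD field was not adjusted correctly.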
it won't give you correct results.
just use:
```
export DUPHOLD_SAMPLE_NAME=sqcudn971184.187147
```
did those exit with 0? do you see "[duphold] finished" in the stderr of those jobs? did you allocate enough time and memory for the job to finish on the nodes?...
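one way to check the stderr question across many jobs is sketched below. it assumes each job wrote its stderr to its own `.err` file; the directory, file names, and log contents are invented so the snippet runs on its own.

```shell
# Hypothetical layout: one stderr log per job, created here only for the demo.
logdir=$(mktemp -d)
printf '[duphold] finished\n' > "$logdir/sampleA.err"
printf 'some earlier message\n' > "$logdir/sampleB.err"

incomplete=""
for log in "$logdir"/*.err; do
    # grep -F treats the pattern as a fixed string, so the brackets are literal;
    # -q suppresses output and only sets the exit status.
    if ! grep -qF "[duphold] finished" "$log"; then
        incomplete="$incomplete $(basename "$log" .err)"
    fi
done

echo "jobs without a finished message:$incomplete"
rm -r "$logdir"
```

any sample listed there is worth re-checking for exit status and resource limits.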
is the error intermittent? or does it occur every time for these same cram files?
I doubt it's the memory, but I can't think of anything else. can you share the vcf+bam for one of the failing samples?
and you are running all samples with `-v adsp5k.scalpel.genes.indel.norm.ann.vcf.gz`?
instead of piping to bgzip, can you instead just use `-o $sample.vcf.gz`? that will be faster anyway and rule out piping issues.
I still don't know why/how this happened, but, if you weren't already, always run any script with pipes with the `set -o pipefail` option in bash to make sure you...
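a small bash sketch of what `pipefail` changes, using `false | cat` as a stand-in for a failing command piped into a compressor (no duphold involved):

```shell
#!/usr/bin/env bash

# Without pipefail, the pipeline's status is that of the last command (cat),
# so the failure of the first command is hidden.
set +o pipefail
false | cat
without=$?

# With pipefail, the pipeline returns the status of the rightmost command
# that failed, so the failure of `false` is visible.
set -o pipefail
with=0
false | cat || with=$?

echo "without pipefail: $without, with pipefail: $with"
```

with the option set, a script checking `$?` (or running under `set -e`) stops on the failed step instead of carrying on with a truncated output file.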