gatk

Adds option to write sample names in variantCounts

Open odcambc opened this issue 1 year ago • 0 comments

In some cases it is useful to know which reads give rise to which variants. I have run into several cases while debugging strange results where this would have helped, and there is also a QC workflow we would like to implement for which this information is essential. It is unlikely to be generally useful, however.

This PR adds a flag, --write-qnames, which, for each variant, writes the qnames of the reads in the BAM that give rise to that variant as a comma-separated list in the final column.

This PR also makes synonymous variants (with no protein-level consequence) write an empty value rather than nothing, in order to keep column order.
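For anyone who wants to consume the new column downstream, here is a minimal sketch in Python of reading it back. The column layout in the example (count, length, nucleotide change, protein change, qnames) is made up for illustration and is not the actual variantCounts format; the only things assumed from this PR are that the file is tab-separated, that the qnames are a comma-separated list in the final column, and that a synonymous variant carries an empty protein-level field.

```python
import io


def read_variant_qnames(handle):
    """Yield (fields, qnames) per row; qnames is the final column split on commas."""
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue
        # Splitting on tabs keeps empty fields, so column positions stay fixed.
        *fields, qname_col = line.split("\t")
        qnames = [q for q in qname_col.split(",") if q]
        yield fields, qnames


# Hypothetical rows: the second one is synonymous, so its protein field is empty.
example = io.StringIO(
    "3\t10\tA>G\tK4R\tread1,read7,read9\n"
    "1\t10\tC>T\t\tread2\n"
)
rows = list(read_variant_qnames(example))
for fields, qnames in rows:
    print(fields, qnames)
```

Because the synonymous row writes an empty value instead of skipping the field, both rows split into the same number of columns and the qnames can always be taken from the last one.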

This seems to work with single-end (SE) reads, but hasn't been tested much with paired-end (PE) reads.

Read names should probably also not be parsed by default, but only if --write-qnames is set.
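A rough sketch of what that gating could look like, written in Python for brevity rather than as the tool's actual code. Everything here is hypothetical: `count_variants`, the `(variant, read)` pairs, and the `name_of` accessor are stand-ins, and the only point is that the read name is never touched unless the flag is set.

```python
from collections import defaultdict


def count_variants(observations, write_qnames=False,
                   name_of=lambda read: read["qname"]):
    """Count variants; collect read names only when write_qnames is True.

    observations: iterable of (variant, read) pairs (hypothetical shape).
    name_of: hypothetical accessor that extracts the read name.
    """
    counts = defaultdict(int)
    qnames = defaultdict(list) if write_qnames else None
    for variant, read in observations:
        counts[variant] += 1
        if qnames is not None:
            # The name is only extracted on this branch.
            qnames[variant].append(name_of(read))
    return dict(counts), (dict(qnames) if qnames is not None else None)


obs = [
    ("A>G", {"qname": "read1"}),
    ("A>G", {"qname": "read7"}),
    ("C>T", {"qname": "read2"}),
]
counts_only, no_names = count_variants(obs)
counts_named, names = count_variants(obs, write_qnames=True)
```

With the flag off, the second return value is `None` and the default path does no extra work per read.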

odcambc, Jan 15 '24 22:01